天地's repositories
chinese-segmentation-evaluation
中文分词工具评估
awesome-chatgpt-prompts-zh
ChatGPT 中文调教指南。各种场景使用指南。学习怎么让它听你的话。
BERT-BiLSTM-CRF-NER
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private server services
BERT-NER-Pytorch
Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)
bert4keras
keras implement of transformers for humans
Chinese-Names-Corpus
中文人名语料库。中文姓名,姓氏,名字,称呼,日本人名,翻译人名,英文人名。成语词典。
chinese-xinhua
:orange_book: 中华新华字典数据库。包括歇后语,成语,词语,汉字。
clash-for-linux
clash-for-linux
GithubMonitor
根据关键字与 hosts 生成的关键词,利用 github 提供的 api,监控 git 泄漏。
HanLP
中文分词 词性标注 命名实体识别 依存句法分析 语义依存分析 新词发现 关键词短语提取 自动摘要 文本分类聚类 拼音简繁转换 自然语言处理
Enterprise-Registration-Data-of-Chinese-Mainland
**大陆 31 个省份1978 年至 2019 年一千多万工商企业注册信息,包含企业名称、注册地址、统一社会信用代码、地区、注册日期、经营范围、法人代表、注册资金、企业类型等详细资料。This repository is an dataset of over 10,000,000 enterprise registration data of 31 provinces in Chinese mainland from 1978 to 2019.【工商大数据】、【企业信息】、【enterprise registration data】。
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Information-Extraction-Chinese
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT 中文实体识别与关系提取
jeecg-boot
「企业级低代码平台」前后端分离架构SpringBoot 2.x,SpringCloud,Ant Design&Vue,Mybatis-plus,Shiro,JWT。强大的代码生成器让前后端代码一键生成,无需写任何代码! 引领新的开发模式OnlineCoding->代码生成->手工MERGE,帮助Java项目解决70%重复工作,让开发更关注业务,既能快速提高效率,帮助公司节省成本,同时又不失灵活性。
label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Langchain-Chatchat
Langchain-Chatchat(原Langchain-ChatGLM)基于 Langchain 与 ChatGLM 等语言模型的本地知识库问答 | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM) QA app with langchain
llama_index
LlamaIndex (formerly GPT Index) is a data framework for your LLM applications
llms_tool
一个基于HuggingFace开发的大语言模型训练、测试工具。支持各模型的webui、终端预测,低参数量及全参数模型训练(预训练、SFT、RM、PPO、DPO)和融合、量化。
MiNLP
XiaoMi Natural Language Processing Toolkits
NCRFpp
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
paper2gui
Convert AI papers to GUI,Make it easy and convenient for everyone to use artificial intelligence technology。让每个人都简单方便的使用前沿人工智能技术
Pattern-Exploiting-Training
Pattern-Exploiting Training在中文上的简单实验
pypytorch
PyPyTorch is a simple deep learning framework implemented by Python3
pytorch-crf
Conditional random field in PyTorch.
swift
ms-swift: Use PEFT or Full-parameter to finetune 200+ LLMs or 15+ MLLMs
t5_in_bert4keras
整理一下在keras中使用T5模型的要点
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
toy_algorithms_in_python
algorithm implementation in python \n make it work, and then make it better.