WenH's starred repositories
chatgpt-corpus
ChatGPT Chinese corpus: dialogue, fiction, and customer-service corpora for training large models
chatterbot-corpus
A multilingual dialog corpus
chinese-chatbot-corpus
A collection of public Chinese chat corpora
Awesome-Continual-Learning
A curated list of Continual Learning papers and BibTeX entries
IE-Datasets-Collections
A compilation of Chinese and English information extraction datasets
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and to integrate as many LLM-related technologies as possible.
LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Awesome-LLMs-Datasets
A summary of existing representative text datasets for LLMs.
Awesome-LLM-Interpretability
A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.
awesome-chatgpt-prompts-zh
A Chinese guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you say.
awesome-instruction-datasets
A collection of awesome-prompt-datasets and awesome-instruction-dataset resources, gathering a wide variety of instruction datasets for training ChatLLMs such as ChatGPT.
Awesome-Chinese-LLM
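A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are inexpensive to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.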
prompt2model
prompt2model - Generate Deployable Models from Natural Language Instructions
sparse-probing-paper
Full code for the sparse probing paper.