huziyuan14's repositories
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data, continuously expanding), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, providing a convenient LLM-IFT research platform. Meanwhile, the tabular_llm branch builds an LLM for tabular intelligence tasks.
alpaca-lora
Instruct-tune LLaMA on consumer hardware
BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue LLM)
bookcorpus
Crawl BookCorpus
chatGLM-6B-QLoRA
Efficient 4-bit QLoRA fine-tuning of ChatGLM-6B with the peft library, including merging the LoRA model into the base model and 4-bit quantization.
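The merge step described above amounts to folding the low-rank adapter back into the frozen base weights: W' = W + (alpha / r) * B @ A. A minimal pure-Python sketch of that arithmetic (toy dimensions, not the peft API):

```python
# Toy illustration of merging a LoRA adapter into a base weight matrix:
# W_merged = W + (alpha / r) * (B @ A), where A is (r x in) and B is (out x r).
# Pure Python, no deep-learning framework; shapes are tiny for clarity.

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A as a new matrix."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 base weight with a rank-1 adapter (r = 1).
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 2.0]]          # r x in  = 1 x 2
B = [[0.5], [0.25]]       # out x r = 2 x 1
merged = merge_lora(W, A, B, alpha=2.0, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

After merging, the adapter matrices can be discarded and the model served as a single dense checkpoint, which is why the merge precedes the final quantization step.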
ChatGLM-Tuning
An affordable ChatGPT-style implementation based on ChatGLM-6B + LoRA.
chatgpt-corpus
A Chinese ChatGPT corpus: dialogue, novel, and customer-service data for training large models.
ChatLM-mini-Chinese
A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks.
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment.
LexiconAugmentedNER
A simple approach to incorporating lexicons for Chinese NER, avoiding complicated operations.
qlora
QLoRA: Efficient Finetuning of Quantized LLMs
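QLoRA keeps the frozen base model in 4-bit precision while training adapters. As a toy illustration of the quantize/dequantize round trip (symmetric absmax int4; QLoRA itself uses the NF4 data type plus double quantization, which this sketch does not implement):

```python
# Toy symmetric absmax 4-bit quantization (integer levels -8..7).
# Illustrative only: QLoRA uses NF4 + double quantization, not this scheme.

def quantize_4bit(xs):
    """Map floats to 4-bit integers in [-8, 7] using an absmax scale."""
    scale = max(abs(x) for x in xs) / 7.0 if any(xs) else 1.0
    q = [max(-8, min(7, round(x / scale))) for x in xs]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.7, 0.35, 0.0]
q, scale = quantize_4bit(weights)
approx = dequantize_4bit(q, scale)
# Each reconstructed value is within one quantization step of the original.
assert all(abs(a - w) <= scale for a, w in zip(approx, weights))
```

The point of the round trip: 4-bit storage cuts memory roughly 4x versus fp16, at the cost of a bounded per-weight error of at most one quantization step.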
Chinese-Medical-Entity-Recognition
Chinese medical entity recognition using BERT + Bi-LSTM + CRF.
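In a BERT + Bi-LSTM + CRF tagger, the CRF layer selects the highest-scoring tag sequence via Viterbi decoding over emission and transition scores. A minimal pure-Python sketch with toy scores (illustrative values, not this repo's code):

```python
# Minimal Viterbi decoding as used by a linear-chain CRF tagging layer.
# In a real model, emissions come from BERT + Bi-LSTM and the transition
# matrix is learned; the numbers below are toy values for illustration.

def viterbi(emissions, transitions):
    """emissions: [timestep][tag] scores; transitions: [prev][next] scores.
    Returns the highest-scoring tag index sequence."""
    n_tags = len(emissions[0])
    score = list(emissions[0])   # best path score ending in each tag
    backptr = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for nxt in range(n_tags):
            best_prev = max(range(n_tags),
                            key=lambda p: score[p] + transitions[p][nxt])
            ptrs.append(best_prev)
            new_score.append(score[best_prev]
                             + transitions[best_prev][nxt] + emit[nxt])
        score, backptr = new_score, backptr + [ptrs]
    # Trace back from the best final tag.
    best = max(range(n_tags), key=lambda t: score[t])
    path = [best]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    return path[::-1]

# Tags: 0 = O, 1 = B-ENT, 2 = I-ENT
emissions = [[0.1, 2.0, 0.0],     # strong B-ENT evidence
             [0.2, 0.1, 1.5],     # strong I-ENT evidence
             [1.0, 0.3, 0.2]]     # strong O evidence
transitions = [[0.5, 0.5, -5.0],  # O -> I-ENT effectively forbidden
               [0.0, -1.0, 1.0],  # B-ENT -> I-ENT favoured
               [0.5, 0.0, 0.5]]
print(viterbi(emissions, transitions))  # [1, 2, 0]
```

The transition matrix is what distinguishes a CRF head from per-token softmax: invalid sequences such as O followed by I-ENT are penalized jointly rather than predicted independently.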
ChineseNLPCorpus
Chinese NLP datasets, collected as material for everyday experiments. Contributions and pull requests welcome.
CLUEDatasetSearch
Search across all Chinese NLP datasets, with commonly used English NLP datasets included.
ColossalAI
Making large AI models cheaper, faster and more accessible
hugging-multi-agent
A tutorial based on MetaGPT to help you quickly understand agent and multi-agent concepts and get started with development.
KnowLM
Knowledgeable Large Language Model Framework.
Linly
Chinese-LLaMA 1&2 and Chinese-Falcon base models; the ChatFlow Chinese dialogue model; a Chinese OpenLLaMA model; and NLP pre-training/instruction fine-tuning datasets.
lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
LLaMA-Efficient-Tuning
Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen)
LLM-SFT
Chinese LLM fine-tuning (LLM-SFT) with the MWP-Instruct math instruction dataset. Supports models (ChatGLM-6B, LLaMA, Bloom-7B, Baichuan-7B), methods (LoRA, QLoRA, DeepSpeed, UI, TensorboardX), and workflows (fine-tuning, inference, evaluation, API).
LLM-Tuning
Tuning LLMs with no tears💦, sharing LLM-tools with love❤️.
NLP2
Assorted applied natural language processing models.
Phi2-mini-Chinese
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for loading a local knowledge base for retrieval-augmented generation (RAG).
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
text-generation-webui
A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Transformer-pytorch
A Transformer implementation in PyTorch (Python).
transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.