yuanmeng1120/quickllm

I. Introduction

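An LLM learning resource library: a project implemented in PyTorch, with some TensorFlow 2, containing LLM-related applications and code that can be run and debugged locally.

To reiterate, the project emphasizes three things: a habit of debugging locally, easy code navigation (jump to definition), and getting up to speed quickly on LLM-related work.

Next update: 1. a PyTorch-based MoE model; 2. TensorRT and inference acceleration.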

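How to debug:

Option 1. The example is a single script file: copy the .py script from examples into the directory that contains the quickllm folder (the project root).

Option 2. The example is a folder: either change its dependencies to relative paths or arrange it as in Option 1; alternatively, in any script that imports with "from quickllm", add the parent directory of the quickllm folder to sys.path:

import sys
sys.path.append('/path/to/directory of quickllm')  # parent directory of the quickllm folder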
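
A hard-coded absolute path breaks as soon as the project is moved. The sketch below is one possible alternative, not part of quickllm: it derives the directory from the location of the running script. The helper name add_quickllm_parent and the directory layout in the comments are hypothetical.

```python
import sys
from pathlib import Path


def add_quickllm_parent(script_file: str, levels_up: int = 1) -> str:
    """Put an ancestor directory of `script_file` at the front of sys.path.

    levels_up=1 is the directory containing the script, 2 is its parent, and so on.
    Returns the directory that was added (or was already present).
    """
    parent = Path(script_file).resolve().parents[levels_up - 1]
    parent_str = str(parent)
    if parent_str not in sys.path:
        sys.path.insert(0, parent_str)
    return parent_str


# Hypothetical layout: <root>/quickllm/ and <root>/examples/basic/demo.py.
# Seen from demo.py, <root> is three levels up (basic -> examples -> <root>):
# add_quickllm_parent(__file__, levels_up=3)
```

Calling the helper at the top of a script, before the first "from quickllm" import, has the same effect as the sys.path.append line above.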

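Core features: inference, fine-tuning, and prompt applications using the weights of open-source LLMs such as chatglm, chatglm2, llama, llama2, baichuan, ziya, and bloom.

Project advantage: code is linked through relative paths, which makes it easy to jump to definitions, read, and debug.

Planned updates: besides more LLM-related code, some TF2 implementations and service-deployment work will be added. Stay tuned.

Video course: "Theoretical Foundations of Large Models and Industrial Deployment in Practice" (《大模型理论基础和工业落地实战》), based on triton + flask + chatglm2 and taught by the author. Follow the WeChat official account "NLP小讲堂" and reply with your contact details; contact the author for a 50% discount.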

II. Usage (chatglm2 as the example)

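Basic workflow:

pip install -r requirements.txt -i https://pypi.douban.com/simple

1. Define the config parameters and settings, and load the dataset (for the weights of other models, see the list in Section III).

chatglm2 weights: https://huggingface.co/THUDM/chatglm2-6b

chatglm2 data: https://cloud.tsinghua.edu.cn/f/b3f119/a008264b1cabd1/?dl=1

2. Write the prompt-loading code and define the model (training and validation functions).

3. Load the data, start training and validation, and debug the code. Once debugging is done, save the modified script to examples and move on to the next one.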
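
As a rough illustration of steps 1 and 2, the sketch below reads a JSON-lines dataset and turns each record into a single-round prompt using the "[Round N]" format from the quick-start script in this README. It is a sketch under assumptions rather than the project's actual data pipeline: the function names and the field names "content" and "summary" are hypothetical and should be adapted to the dataset in use.

```python
import json
from typing import Iterable, List, Tuple


def format_chatglm2_prompt(query: str, history: List[Tuple[str, str]]) -> str:
    """Build a round-based prompt from earlier (query, response) pairs plus a new query."""
    prompt = ""
    for i, (old_query, response) in enumerate(history):
        prompt += "[Round {}]\n\n问：{}\n\n答：{}\n".format(i + 1, old_query, response)
    prompt += "[Round {}]\n\n问：{}\n\n答：".format(len(history) + 1, query)
    return prompt


def load_examples(lines: Iterable[str],
                  prompt_key: str = "content",      # assumed field name
                  response_key: str = "summary"     # assumed field name
                  ) -> List[Tuple[str, str]]:
    """Parse JSON-lines text into (prompt, target) pairs, skipping blank lines."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        prompt = format_chatglm2_prompt(record[prompt_key], history=[])
        examples.append((prompt, record[response_key]))
    return examples
```

Because load_examples accepts any iterable of strings, an open file handle can be passed to it directly, for example load_examples(open(path, encoding="utf-8")).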

Quick start: copy examples/basic/glm/basic_language_model_chatglm2.py to the project root, set breakpoints, and start debugging.

# -*- coding: utf-8 -*-
"""
    @Project : quickllm
    @File    : quickly_start.py
    @Author  : ys
    @Time    : 2023/12/12 12:12
    Official project: https://github.com/THUDM/ChatGLM2-6B
"""

import os
import torch
from loguru import logger
from transformers import AutoTokenizer

from quickllm.models import build_transformer_model
from quickllm.generation import SeqGeneration


class ExpertModel:

    def __init__(self):
        # Persona prefix shown before the dialogue. In English: "Please answer the
        # following questions as a knowledge-graph expert in the medical domain:"
        self.prompt = "请以一位医疗领域知识图谱专家的角色回答以下问题："
        self.choice = 'default'  # default, int4, 32k
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.model = None  # built lazily on first use, see chat()

        # Weights from https://huggingface.co/THUDM/chatglm2-6b
        self.dir_path = "/path/to/my/pretrain_ckpt/glm/chatglm2-6B"
        self.checkpoint_path = [os.path.join(self.dir_path, i)
                                for i in os.listdir(self.dir_path) if i.endswith('.bin')]
        # Copied from this project: examples/basic/glm/chatglm2-6B/quickllm_config.json
        self.config_path = os.path.join(self.dir_path, 'quickllm_config.json')

    def build_prompt(self, history):
        # Rebuild the display text on every call so that self.prompt is never mutated
        prompt = self.prompt
        for query, response in history:
            prompt += f"\n\nUser: {query}"
            prompt += f"\n\nChatGLM2-6B: {response}"
        return prompt

    def build_model(self):
        tokenizer = AutoTokenizer.from_pretrained(self.dir_path, trust_remote_code=True)
        encoder = build_transformer_model(config_path=self.config_path,
                                          checkpoint_path=self.checkpoint_path)
        if self.choice in {'default', '32k'}:
            encoder = encoder.half()  # fp16 for the non-quantized checkpoints
        encoder = encoder.to(self.device)

        model = SeqGeneration(encoder, tokenizer, start_id=None, end_id=tokenizer.eos_token_id, mode='random_sample',
                              maxlen=2048, default_rtype='logits', use_states=True)
        return model

    def chat(self, query, history):
        if self.model is None:  # load the weights once instead of on every query
            self.model = self.build_model()

        # ChatGLM2 round-based prompt format; the Chinese markers are part of the format
        prompt = ""
        for i, (old_query, response) in enumerate(history):
            prompt += "[Round {}]\n\n问：{}\n\n答：{}\n".format(i + 1, old_query, response)
        prompt += "[Round {}]\n\n问：{}\n\n答：".format(len(history) + 1, query)

        for response in self.model.stream_generate(prompt, topk=50, topp=0.7, temperature=0.95):
            new_history = history + [(query, response)]
            yield response, new_history

    def main(self):
        history = []
        logger.info("---- Welcome to ChatGLM2-6B. Enter a prompt to chat, 'clear' to reset the history, 'stop' to exit")
        while True:
            query = input("\nQuestion: ")
            if query.strip() == "stop":
                break
            if query.strip() == "clear":
                history = []
                print("---- History cleared ----")
                continue
            # Print the partial dialogue as the response streams in
            for response, history in self.chat(query, history=history):
                print(self.build_prompt(history), flush=True)

            print(self.build_prompt(history), flush=True)
            torch.cuda.empty_cache()


if __name__ == '__main__':
    expert_bot = ExpertModel()
    expert_bot.main()
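
The script passes topk=50, topp=0.7 and temperature=0.95 to stream_generate. The sketch below shows what these three knobs conventionally mean in top-k / nucleus sampling, written in plain Python so it can be stepped through in a debugger. It illustrates the generic technique and is not necessarily how quickllm's SeqGeneration implements mode='random_sample'; the function names are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional


def filter_top_k_top_p(logits: List[float], topk: int, topp: float,
                       temperature: float) -> Dict[int, float]:
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable tokens
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:topk]

    # Top-p: keep the smallest prefix whose cumulative probability reaches topp
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= topp:
            break

    # Renormalize over the surviving tokens
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


def sample_token(logits: List[float], topk: int = 50, topp: float = 0.7,
                 temperature: float = 0.95,
                 rng: Optional[random.Random] = None) -> int:
    """Draw one token id from the filtered distribution."""
    rng = rng or random.Random()
    dist = filter_top_k_top_p(logits, topk, topp, temperature)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

Lowering topp or topk narrows the candidate set and makes the output more deterministic; with topk=1 the sampler always returns the most probable token.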

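Quick fine-tuning: move examples/llm/task_chatglm2_lora.py to the project root, set breakpoints, and start debugging.

Quick deployment: move examples/serving/basic_simple_web_serving_simbert.py to the project root, set breakpoints, and start debugging.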

III. Pretrained weights

Unless noted otherwise, use the pytorch_model.bin and config.json that ship with the weights.

Category	Model	Source	Weight links	Notes (if any)
chatglm	chatglm-6b	THUDM	github, v0.1.0, v1.1.0, int8, int4	config
	chatglm2-6b	THUDM	github, v2, int4, 32k	config
	chatglm3-6b	THUDM	github, v3, 32k	config
bert	bert-base-chinese	PyTorch port of Google BERT	torch	config
	chinese_L-12_H-768_A-12	Google	github, tf	conversion command, config
	chinese-bert-wwm-ext	HFL	tf/torch, torch
	bert-base-multilingual-cased	huggingface	torch	config
	macbert	HFL	tf/torch, torch
	wobert	Zhuiyi Technology	tf, torch_base, torch_plus_base
	guwenbert	ethanyt	torch	config
roberta	chinese-roberta-wwm-ext	HFL	tf/torch, torch
	roberta-small/tiny	Zhuiyi Technology & UER	tf, torch	conversion script
	roberta-base-english	huggingface	torch	config
albert	albert	brightmart	tf, torch, torch
nezha	NEZHA	Huawei	tf, torch
xlnet	chinese-xlnet	HFL	tf/torch	config
deberta	Erlangshen-DeBERTa-v2	IDEA	torch
electra	Chinese-ELECTRA	HFL	tf, torch
ernie	ernie	Baidu ERNIE (Wenxin)	paddle, torch
roformer	roformer	Zhuiyi Technology	tf, torch
	roformer_v2	Zhuiyi Technology	tf, torch
simbert	simbert	Zhuiyi Technology	tf, torch_base	conversion script
	simbert_v2/roformer-sim	Zhuiyi Technology	tf, torch
gau	GAU-alpha	Zhuiyi Technology	tf	conversion script
gpt	CDial-GPT	thu-coai	torch	config
gpt2	cmp_lm (2.6B parameters)	Tsinghua	torch	config
	gpt2-chinese-cluecorpussmall	UER	torch	config
	gpt2-ml	imcaspar	tf, torch	config
bart	bart_base_chinese	Fudan fnlp	torch, v1.0, v2.0	config
t5	t5	UER	torch	config_base, config_small
	mt5	Google	torch	config
	t5_pegasus	Zhuiyi Technology	tf	config_base, config_small
	chatyuan v1&v2	clue-ai	torch	config
	PromptCLUE	clue-ai	torch	config
llama	llama	facebook	github	config
	llama-2	facebook	github, 7b, 7b-chat, 13b, 13b-chat	config
	chinese_llama_alpaca	HFL	github	config
	Belle_llama	LianjiaTech	github, 7B-2M-enc	merge instructions, config
	Ziya	IDEA-CCNL	v1, v1.1, pretrain-v1	config
	Baichuan	baichuan-inc	github, 7B, 13B-Base, 13B-Chat	config
	Baichuan2	baichuan-inc	github, 7B-Base, 7B-Chat, 13B-Base, 13B-Chat	config
	vicuna	lmsys	7b-v1.5	config
	Yi	01-ai	github, 6B, 6B-200K	config
bloom	bloom	bigscience	bloom-560m, bloomz-560m	config
Qwen	Qwen	Alibaba Cloud	github, 7B, 7B-Chat	config
InternLM	InternLM	Shanghai AI Laboratory	github, 7B-Chat, 7B	config
Falcon	Falcon	tiiuae	hf, RW-1B, 7B, 7B-Instruct	config
embedding	text2vec-base-chinese	shibing624	torch
	m3e	moka-ai	torch	config
	bge	BAAI	torch	config
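
The rule stated before the table (use the .bin weights and the config file that ship with a model) can be checked programmatically before a model is built. The helper below is a hypothetical sketch and not part of quickllm. It collects every .bin file in a weight directory and verifies that the config file exists; the config_name parameter is an assumption that covers cases such as the quick-start script, which uses quickllm_config.json instead.

```python
import os
from typing import List, Tuple


def find_weights(dir_path: str, config_name: str = "config.json") -> Tuple[str, List[str]]:
    """Return (config_path, checkpoint_paths) for a local weight directory.

    Checkpoint paths are sorted so that sharded files keep a stable order;
    os.listdir alone gives no ordering guarantee.
    """
    checkpoints = sorted(
        os.path.join(dir_path, name)
        for name in os.listdir(dir_path)
        if name.endswith(".bin")
    )
    config_path = os.path.join(dir_path, config_name)
    if not checkpoints:
        raise FileNotFoundError("no .bin checkpoint found in " + dir_path)
    if not os.path.isfile(config_path):
        raise FileNotFoundError("missing config file: " + config_path)
    return config_path, checkpoints
```

Failing early with a clear message is usually easier to debug than an error raised deep inside model construction.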

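IV. Latest updates

20231210: added a TensorFlow 2 implementation of ChatGLM
20231208: added TensorFlow 2 implementations of various activation functions
20231203: added a TensorFlow 2 implementation of GlobalPointer
20231201: added a TensorFlow 2 implementation of the RoPE function
20231122: added a TensorFlow 2 implementation of LoRA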

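V. Acknowledgements

Thanks to Tongjilibo's bert4torch. This implementation draws heavily on that project, with optimizations and updates, and will keep tracking the latest bert4torch changes.

Thanks to Su Jianlin ("苏神") for bert4keras. Parts of this project reference the bert4keras source code; sincere thanks for his generous contributions. His blog: 科学空间 (Scientific Spaces).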

@misc{bert4torch,
  title={bert4torch},
  author={Bo Li},
  year={2022},
  howpublished={\url{https://github.com/Tongjilibo/bert4torch}},
}

VI. Citation

@misc{quickllm,
  title={quickllm},
  author={NLP小讲堂},
  year={2022},
  howpublished={\url{https://github.com/zysNLP/quickllm}},
}

VII. Other

WeChat ID

WeChat group
