BAICHUAN2 has no MakeInput implementation

Question

BAICHUAN2 has no MakeInput implementation

yiguanxian opened this issue 7 months ago · comments

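The BAICHUAN family of models does not have this interface: virtual std::string MakeInput(const std::string &history, int round, const std::string &input). Does that mean I have to build the prompt myself, concatenating it the way the official implementation does?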

TylunasLi · Answer 1 · Sat Jan 06 2024 00:02:05 GMT+0800 (China Standard Time)

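The baichuan2 model is implemented by the fastllm::LlamaModel class, and that class implements the MakeInput() method.
Special tokens in the prompt are written in the "<FLM_FIX_TOKEN_{id}>" format and are stored in the model file at conversion time.

So you do not need to implement MakeInput() yourself.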
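To make the "<FLM_FIX_TOKEN_{id}>" convention concrete, here is a rough Python sketch of the kind of concatenation a MakeInput()/MakeHistory() pair performs. It is an illustration only, not fastllm's actual source. The ids 195 and 196 are assumed to be the user and assistant token ids from Baichuan2-Chat's generation config, and the empty pre-prompt and history separator are likewise assumptions.

```python
# Assumed role markers: 195 / 196 are taken to be the user / assistant token ids
# of Baichuan2-Chat. Check the converted model file for the real values.
USER_ROLE = "<FLM_FIX_TOKEN_195>"
BOT_ROLE = "<FLM_FIX_TOKEN_196>"
PRE_PROMPT = ""    # assumption: no system prefix
HISTORY_SEP = ""   # assumption: no separator between rounds


def make_input(history: str, round_idx: int, user_input: str) -> str:
    """Build the text fed to the model for the current round."""
    prefix = PRE_PROMPT if round_idx == 0 else history
    return prefix + USER_ROLE + user_input + BOT_ROLE


def make_history(history: str, round_idx: int, user_input: str, output: str) -> str:
    """Append a finished round (question and answer) to the running history."""
    prefix = PRE_PROMPT if round_idx == 0 else history
    return prefix + USER_ROLE + user_input + BOT_ROLE + output + HISTORY_SEP
```

For a first round, make_input("", 0, "hi") yields the user marker, the text, then the assistant marker, which is where the model starts generating.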

buchidanhuanger · Answer 2 · Sat Jan 06 2024 03:48:41 GMT+0800 (China Standard Time)

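Is there an example for baichuan2? I am seeing what looks like a loss of accuracy, and I am not sure whether the problem is how I am using it. This is my code: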
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig
from fastllm_pytools import llm

modelpath = "baichuan-inc/Baichuan2-13B-Chat"
tokenizer = AutoTokenizer.from_pretrained(modelpath, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(modelpath, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained(modelpath)
model = llm.from_hf(model, tokenizer, dtype = "int8")
model.stream_response("怎么创建用户")  # prompt: "How do I create a user?"

buchidanhuanger · Answer 3 · Sat Jan 06 2024 12:12:19 GMT+0800 (China Standard Time)

@TylunasLi As above: for baichuan2, I am not sure whether I need to preprocess the input myself, the way the original model does, before calling stream_response in fastllm.

TylunasLi · Answer 4 · Sun Jan 07 2024 19:27:22 GMT+0800 (China Standard Time)

> @TylunasLi As above: for baichuan2, I am not sure whether I need to preprocess the input myself, the way the original model does, before calling stream_response in fastllm.

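I looked into it. At the moment, hf_model.create() and torch2flm.tofile() in fastllm handle Baichuan 2 differently, so the "two lines of code" acceleration path (llm.from_hf) does not work correctly for this model. I am planning to submit a PR to fix it.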
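Until such a fix lands, one possible workaround is to go through the export path instead of llm.from_hf. The sketch below assumes that torch2flm.tofile() is the path that handles Baichuan 2 correctly, which the thread implies but does not confirm. The helper name and the export file name are hypothetical.

```python
def export_then_load(hf_model, tokenizer,
                     export_path="baichuan2-13b-int8.flm", dtype="int8"):
    """Export a Hugging Face model to a .flm file, then load it with fastllm.

    hf_model / tokenizer are the objects returned by transformers'
    from_pretrained(); export_path is a hypothetical file name.
    """
    # Deferred import so this sketch can be read without fastllm installed.
    from fastllm_pytools import llm, torch2flm

    # Write the converted, quantized model to disk.
    torch2flm.tofile(export_path, hf_model, tokenizer, dtype=dtype)
    # Load the exported file instead of calling llm.from_hf().
    return llm.model(export_path)
```

Note that stream_response() is a generator, so the returned model has to be iterated to produce output, for example: for chunk in model.stream_response(prompt): print(chunk, end="", flush=True).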

buchidanhuanger · Answer 5 · Mon Jan 08 2024 22:07:36 GMT+0800 (China Standard Time)

@TylunasLi

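1. Given the answer above, should I use hf_model.create() or torch2flm.tofile() with the current version?
2. Also, with the code I posted above, how does fastllm determine my model type (for example, whether it is baichuan2 or chatglm)?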

wenzhaojie · Answer 6 · Sat Jan 27 2024 12:14:35 GMT+0800 (China Standard Time)

When I run batch inference on baichuan-13b-chat with pyfastllm, why are the outputs so poor? batch response: <Response [200]> text:
"(1/4)
prompt: 你好，请问你是谁？
response:
(2/4)
prompt: 今天天气怎么样？
response: ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !
(3/4)
prompt: How are you？
response: - 2019-0000-000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
(4/4)
prompt: こんにちは
response: 。私は田中です。これは私の新しいアパートです。私はここです。これは私の部屋です。これは私の机です。これは私の本です。これは私の音楽です。これは私のカフェーです。これは私の町です。これは
"

TylunasLi · Answer 7 · Thu Feb 22 2024 23:30:15 GMT+0800 (China Standard Time)

@wenzhaojie

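I looked at the existing batch_response example. The web_api currently calls the generate method, which does not assemble the prompt, so the model gives very strange answers.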
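As a client-side stopgap, the raw prompts could be wrapped in the chat role markers before being sent to an endpoint that calls generate directly. This is only a sketch: the helper name is hypothetical, and the default markers assume that Baichuan-13B-Chat uses token ids 195 (user) and 196 (assistant) and that the server accepts the "<FLM_FIX_TOKEN_{id}>" form in its input text.

```python
def wrap_prompts(prompts,
                 user_role="<FLM_FIX_TOKEN_195>",
                 bot_role="<FLM_FIX_TOKEN_196>"):
    """Wrap each raw prompt as a single-round chat input.

    The default role markers are assumptions; replace them with the
    values stored in your converted model.
    """
    return [user_role + p + bot_role for p in prompts]


# Example: prepare the batch from the report above before posting it.
batch = wrap_prompts(["你好，请问你是谁？", "How are you？"])
```

Without the markers, a chat-tuned model sees bare text and simply continues it, which matches the runaway completions shown in the report.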