THUDM / VisualGLM-6B

Chinese and English multimodal conversational language model

'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

ovjust opened this issue

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
image_path = "your image path"
# Prompt: "Describe this image."
response, history = model.chat(tokenizer, image_path, "描述这张图片。", history=[])
print(response)
# Prompt: "Where might this image have been taken?"
response, history = model.chat(tokenizer, image_path, "这张图片可能是在什么场所拍摄的?", history=history)
print(response)

I ran the example code above, with the path changed to the model directory downloaded from https://cloud.tsinghua.edu.cn/d/43ffb021ca5f4897b56a/.
Error 1:
visualglm-6b-model does not appear to have a file named config.json
This was only resolved by switching to the model downloaded from https://huggingface.co/THUDM/visualglm-6b.
Error 2:
'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'
Reportedly, with transformers==4.34.0 the error is 'ChatGLMTokenizer' object has no attribute 'tokenizer'.
The suggested fix is to downgrade transformers, but installing the version below still did not solve it:
pip install transformers==4.33.2 -i https://mirrors.aliyun.com/pypi/simple/

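Suggestions:
Why does a project made in China still leave us with all kinds of network access problems? Why does the state let you access foreign networks but not us? Many people cannot reach GitHub either, and GitHub emails are hard to receive.
Please also publish a complete copy of the code and resources in a repository hosted in China. Please update requirements.txt to pin the version of every library; otherwise, within a few days, anyone trying to run the project will hit version conflicts and be unable to run it.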

commented

Regarding error 2: X-D-Lab/LangChain-ChatGLM-Webui#124 (comment)

Moving the line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)" before the "super().__init__(" call in "__init__" of class ChatGLMTokenizer solves this issue.
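
The effect of that reordering can be shown with a small sketch. It assumes, as the reports in this thread suggest for newer transformers releases, that the base tokenizer constructor calls back into subclass methods before the subclass has finished its own "__init__". All classes below (FakeSPTokenizer, BaseTokenizer, BrokenTokenizer, FixedTokenizer) are hypothetical stand-ins, not the real transformers or VisualGLM code.

```python
class FakeSPTokenizer:
    """Hypothetical stand-in for SPTokenizer; just holds a small vocabulary."""

    def __init__(self, vocab):
        self.vocab = list(vocab)


class BaseTokenizer:
    """Hypothetical base class whose constructor queries the subclass."""

    def __init__(self):
        # The base constructor asks the subclass for its vocabulary size
        # while the subclass may still be only partly initialised.
        self.size_seen_by_base = self.get_vocab_size()

    def get_vocab_size(self):
        raise NotImplementedError


class BrokenTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        super().__init__()  # get_vocab_size() runs here ...
        self.sp_tokenizer = FakeSPTokenizer(vocab)  # ... before this is set

    def get_vocab_size(self):
        return len(self.sp_tokenizer.vocab)


class FixedTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        self.sp_tokenizer = FakeSPTokenizer(vocab)  # set the attribute first
        super().__init__()  # now the callback can use it

    def get_vocab_size(self):
        return len(self.sp_tokenizer.vocab)
```

Constructing BrokenTokenizer raises an AttributeError that names sp_tokenizer, while FixedTokenizer initialises normally. Whether the same ordering change is enough for the real tokenizer depends on the installed transformers version.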

Still getting an error:
Exception has occurred: RuntimeError
Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 22, in __init__
self.sp.Load(model_path)
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 64, in __init__
self.text_tokenizer = TextTokenizer(vocab_file)
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 221, in __init__
self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)
File "D:\1MyFiles\code\python\visualglm-6b-main\2test.py", line 4, in <module>
tokenizer = AutoTokenizer.from_pretrained(r"D:\1MyFiles\code\python\visualglm-6b-model-gitee", trust_remote_code=True)
RuntimeError: Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
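
The thread does not confirm the cause of this RuntimeError. One possibility worth ruling out is that the vocabulary file in the local model directory is not a real SentencePiece model, for example because a Git clone without LFS left only a small pointer file in its place. The sketch below checks for that case; the helper names are hypothetical, and the only format detail it relies on is the standard Git LFS pointer header.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def describe_model_file(path):
    """Give a one-line summary of a file that should hold binary model data."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if looks_like_lfs_pointer(p):
        return "git-lfs pointer (real file was not downloaded)"
    return f"{p.stat().st_size} bytes"
```

To use it, call describe_model_file on the vocabulary file inside the model directory (the file name, such as ice_text.model, is an assumption here; check which file tokenization_chatglm.py actually loads). A result of "missing", a pointer, or a size of only a few hundred bytes would suggest re-downloading the file.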

[Screenshots: Snipaste_2024-01-24_17-33-07, Snipaste_2024-01-24_17-33-32]

reopen

I ran into the same problem. How can it be solved?

Reinstalling transformers at version 4.33.2 fixes it; I have tested this myself.

That solution works. Thanks to the commenter above!
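
A script can check the installed transformers version before it loads the model. This is a minimal sketch: the threshold 4.33.2 is simply the version reported to work in this thread, not an officially documented requirement, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported to work in this thread (an assumption, not an official pin).
KNOWN_GOOD = (4, 33, 2)


def parse_version(text):
    """Turn '4.33.2' or '4.34.0.dev0' into a tuple of the leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_transformers(installed=None):
    """Return a short message about the installed transformers version."""
    if installed is None:
        try:
            installed = version("transformers")
        except PackageNotFoundError:
            return "transformers is not installed"
    if parse_version(installed) > KNOWN_GOOD:
        return (f"transformers {installed} is newer than 4.33.2; "
                "the tokenizer error from this thread may occur")
    return f"transformers {installed} is at or below 4.33.2"
```

Calling print(check_transformers()) at the top of the test script shows which case applies before any model files are loaded.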