'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

Question

'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

ovjust opened this issue 9 months ago

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
image_path = "your image path"
# Prompt: "Describe this image."
response, history = model.chat(tokenizer, image_path, "描述这张图片。", history=[])
print(response)
# Prompt: "Where might this image have been taken?"
response, history = model.chat(tokenizer, image_path, "这张图片可能是在什么场所拍摄的？", history=history)
print(response)

I ran the sample code above, with the path changed to the model directory downloaded from https://cloud.tsinghua.edu.cn/d/43ffb021ca5f4897b56a/.
Error 1:
visualglm-6b-model does not appear to have a file named config.json
This was only resolved after switching to the model downloaded from https://huggingface.co/THUDM/visualglm-6b.
Error 2:
'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'
Reportedly, with transformers==4.34.0 the error is 'ChatGLMTokenizer' object has no attribute 'tokenizer'.
The suggested fix is to downgrade transformers, but installing the version below still did not solve it:
pip install transformers==4.33.2 -i https://mirrors.aliyun.com/pypi/simple/
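
After a downgrade it can help to confirm which transformers version the running interpreter actually sees. The sketch below is one hedged way to do that. The helper names (parse_version, is_reported_broken, installed_transformers_version) are hypothetical, and treating every release from 4.34.0 onward as affected is an assumption based only on the versions mentioned in this thread.

```python
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string, padded to 3."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_reported_broken(version_text, first_bad=(4, 34, 0)):
    """Assumption: versions at or above first_bad trigger the tokenizer error."""
    return parse_version(version_text) >= first_bad


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed in this environment")
    else:
        print(found, "affected" if is_reported_broken(found) else "expected to work")
```

If the printed version is still 4.34.0 or later, the pip command probably installed into a different environment than the one running the script.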

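Suggestions:
Why is it that even with a project made in China, we still run into all kinds of network access problems when using it? Why does the state let you access foreign networks but not us? Many people cannot reach GitHub either, and GitHub emails are hard to receive.
I suggest also publishing a complete copy of the code and resources in a repository hosted in China. Please also update requirements.txt to pin the version of every library; otherwise, within a few days, others trying to run the project will hit version conflicts and it will not run.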

ovjust commented 8 months ago

reopen

Po · Answer 1 · Sat Dec 30 2023 14:39:27 GMT+0800 (China Standard Time)

Regarding error 2: X-D-Lab/LangChain-ChatGLM-Webui#124 (comment)

Moving the line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)" before the "super().__init__(" call in "__init__" of class ChatGLMTokenizer solves this issue.
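
A minimal sketch of why the ordering matters, assuming (as the error message suggests) that the base class constructor in newer transformers releases calls a vocabulary method that the subclass implements through self.sp_tokenizer. BaseTokenizer, BrokenTokenizer, FixedTokenizer and try_build are hypothetical stand-ins, not the real transformers or ChatGLM classes.

```python
class BaseTokenizer:
    """Stand-in for a base class whose __init__ calls an overridable method."""

    def __init__(self):
        # Calls into the subclass before the subclass has finished initializing.
        self.vocab_snapshot = self.get_vocab()

    def get_vocab(self):
        raise NotImplementedError


class BrokenTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        super().__init__()         # get_vocab() runs here ...
        self.sp_tokenizer = vocab  # ... but the attribute is only set now

    def get_vocab(self):
        return dict(self.sp_tokenizer)


class FixedTokenizer(BaseTokenizer):
    def __init__(self, vocab):
        self.sp_tokenizer = vocab  # set the attribute first
        super().__init__()         # get_vocab() can now find it

    def get_vocab(self):
        return dict(self.sp_tokenizer)


def try_build(cls, vocab):
    """Return (instance, None) on success or (None, error message) on failure."""
    try:
        return cls(vocab), None
    except AttributeError as exc:
        return None, str(exc)
```

Building BrokenTokenizer through try_build yields an AttributeError that names sp_tokenizer, the same shape as the error in this thread, while FixedTokenizer constructs normally.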

ovjust · Answer 2 · Wed Jan 24 2024 17:34:57 GMT+0800 (China Standard Time)

I still get an error:
Exception has occurred: RuntimeError
Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 22, in __init__
self.sp.Load(model_path)
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 64, in __init__
self.text_tokenizer = TextTokenizer(vocab_file)
File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\visualglm-6b-model-gitee\tokenization_chatglm.py", line 221, in __init__
self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)
File "D:\1MyFiles\code\python\visualglm-6b-main\2test.py", line 4, in <module>
tokenizer = AutoTokenizer.from_pretrained(r"D:\1MyFiles\code\python\visualglm-6b-model-gitee", trust_remote_code=True)
RuntimeError: Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
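
The ParseFromArray failure means sentencepiece could not parse the file it was handed as a model. One possible cause, and this is only an assumption about the setup above, is that the vocab file in the cloned model directory is a Git LFS pointer stub rather than the real binary. The helper below is a hypothetical check; pass it whichever vocab file your model directory contains.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

If the check returns True, the real file was never downloaded and needs to be fetched again. If it returns False, the file may still be truncated or corrupted, so comparing its size with the published one is a reasonable next step.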

Zhangzq · Answer 3 · Tue Jan 30 2024 15:18:03 GMT+0800 (China Standard Time)

I ran into the same problem. How can it be solved?

Zhangzq · Answer 4 · Tue Jan 30 2024 16:26:53 GMT+0800 (China Standard Time)

I ran into the same problem. How can it be solved?

Reinstalling transformers at version 4.33.2 fixes it; I tested this myself and it works.

Stanley · Answer 5 · Tue Feb 06 2024 22:13:40 GMT+0800 (China Standard Time)

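I ran into the same problem. How can it be solved?

Reinstalling transformers at version 4.33.2 fixes it; I tested this myself and it works.

This solution works. Thanks to the poster above!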