[BUG] 4-bit model produces garbled results

Question

HaoLiuHust opened this issue a month ago · comments

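On a T4 GPU, I loaded the 4-bit model with web_demo_2.5.py, supplied an image, and asked questions about it; almost every answer was nonsense. I then launched the model with the code from Hugging Face, and the answers were nonsense there too.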

No response

No response

- OS: Ubuntu 18.04
- Python: 3.9
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12
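The Transformers and PyTorch fields above were left empty. As a rough sketch only (not part of the original report), something like the following could collect those values in one go. The helper names `package_version` and `collect_environment` are made up for illustration; it relies on the standard-library `importlib.metadata` and assumes the packages are installed under the distribution names `transformers` and `torch`.

```python
import platform
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or a placeholder if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_environment() -> dict:
    """Gather the fields the issue template asks for (hypothetical helper)."""
    info = {
        "OS": platform.platform(),
        "Python": platform.python_version(),
        "Transformers": package_version("transformers"),
        "PyTorch": package_version("torch"),
        "CUDA": "unknown",
    }
    try:
        # Imported lazily so the script still runs when PyTorch is missing.
        import torch

        info["CUDA"] = str(torch.version.cuda)
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"- {key}: {value}")
```

Running the script prints the list in the same `- Field: value` shape as the template, so the output can be pasted straight into the issue.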

No response

Hongji Zhu · Answer 1 · Mon Jun 17 2024 19:41:20 GMT+0800 (China Standard Time)

Please provide the original image so we can reproduce the issue.

CoderInCV · Answer 2 · Tue Jun 18 2024 09:51:26 GMT+0800 (China Standard Time)

Quantizing and deploying with lmdeploy works fine.