PKU-YuanGroup / ChatLaw

ChatLaw: A Powerful LLM Tailored for Chinese Legal. A Chinese legal large language model

Home Page: https://chatlaw.cloud/

peft model inference so slow!!

KLGR123 opened this issue

(image) As shown, I tried setting `load_in_8bit=False` and calling `model = model.merge_and_unload()`, but neither worked. Inference is still so slow that it feels like the output will arrive 2000 years from now. Is there a solution yet?
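For reference, the two attempts described above correspond roughly to the sketch below. It is a minimal illustration, not the repository's own inference script: `base_path` and `adapter_path` are placeholder arguments, the helper names `load_merged_model` and `tokens_per_second` are hypothetical, and loading in fp16 with `device_map="auto"` is an assumption about the setup. The timing helper is only there to put a number on "slow" before and after each change.

```python
import time


def load_merged_model(base_path, adapter_path):
    """Load a base model in fp16 (no 8-bit quantization) and merge a LoRA adapter.

    Both paths are placeholders; substitute your own checkpoints.
    """
    # Imports are deferred so the timing helper below works without these packages.
    import torch
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Not passing load_in_8bit leaves quantization off (same as load_in_8bit=False).
    base = AutoModelForCausalLM.from_pretrained(
        base_path,
        torch_dtype=torch.float16,
        device_map="auto",  # assumption: accelerate is installed and a GPU is available
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    # Fold the adapter weights into the base weights and drop the PEFT wrapper.
    model = model.merge_and_unload()
    return model.eval()


def tokens_per_second(generate_fn, n_new_tokens):
    """Time one call to generate_fn and return n_new_tokens / elapsed seconds."""
    start = time.perf_counter()
    generate_fn()
    elapsed = time.perf_counter() - start
    # Guard against a zero interval for trivially fast callables.
    return n_new_tokens / max(elapsed, 1e-9)
```

A call such as `tokens_per_second(lambda: model.generate(**inputs, max_new_tokens=64), 64)` gives a rough throughput figure, assuming the model actually generates all 64 tokens. If that number stays very low after merging, checking whether the weights ended up on the CPU would be a reasonable next step.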