4-bit quantization of LLaMa using GPTQ [project to add GPT-NeoX and Pythia quantization and inference]
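As a rough illustration of what 4-bit weight quantization means, here is a minimal round-to-nearest sketch in NumPy. This is a simplified baseline, not the GPTQ algorithm itself (GPTQ additionally corrects rounding error using second-order, Hessian-based information); all function names here are illustrative.

```python
import numpy as np

def quantize_rtn_4bit(w, axis=1):
    """Round-to-nearest 4-bit quantization with a per-row scale.

    Illustrative baseline only; GPTQ refines this by compensating
    each column's rounding error across the remaining weights.
    """
    # Symmetric signed 4-bit range: integers in [-8, 7]
    scale = np.max(np.abs(w), axis=axis, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Map 4-bit integers back to approximate float weights
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)
q, scale = quantize_rtn_4bit(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.max(np.abs(w - w_hat)))
```

Storage-wise, each weight then needs only 4 bits plus a shared scale per row (or per group), which is where the memory savings for large models like LLaMa come from.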