tloen / llama-int8

Quantized inference code for LLaMA models

