Panchovix / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

