jllllll/exllama Issues
Run on CPU without AVX2 (Updated)
Strange output (Closed, 2)
Cant import exllama (Closed, 5)
Llama v2 70b (Closed, 2)
nvidia jetson orin wheel ? (Updated, 1)
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.