usyd-fsalab / fp6_llm

Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).

usyd-fsalab/fp6_llm Issues