33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU
wangqn1 opened this issue 5 months ago · comments
Will the airllm framework add support for streaming output across the different models it runs in the future?
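For context, "streaming output" here means yielding tokens to the caller as they are decoded rather than returning the full completion at the end. AirLLM does not expose such an interface in this issue's discussion; the sketch below is purely illustrative, showing the generator-style API shape being asked for. The names `stream_tokens` and `toy_decode` are hypothetical stand-ins, not AirLLM functions.

```python
from typing import Callable, Iterator, List, Optional

def stream_tokens(
    decode_step: Callable[[str, List[str]], Optional[str]],
    prompt: str,
) -> Iterator[str]:
    """Yield tokens one at a time as they are produced.

    `decode_step` is a hypothetical stand-in for a model's incremental
    decode: given the prompt and the tokens generated so far, it returns
    the next token, or None when generation is finished.
    """
    generated: List[str] = []
    while True:
        token = decode_step(prompt, generated)
        if token is None:
            break
        generated.append(token)
        yield token  # the caller can print/flush this immediately

# Toy "model" that emits a fixed reply token by token, so the
# streaming loop above can be exercised without any real weights.
def toy_decode(prompt: str, so_far: List[str]) -> Optional[str]:
    reply = ["Hello", ",", " world", "!"]
    return reply[len(so_far)] if len(so_far) < len(reply) else None

if __name__ == "__main__":
    for tok in stream_tokens(toy_decode, "Hi"):
        print(tok, end="", flush=True)  # prints "Hello, world!" incrementally
    print()
```

A real implementation would replace `toy_decode` with the model's layer-by-layer forward pass; the caller-facing generator interface stays the same.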