33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU
wangqn1 opened this issue 5 months ago · comments
Will the airllm framework add support for streaming output across the different models it runs in the future?
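For context, "streaming output" here means yielding tokens to the caller as they are decoded rather than returning the full completion at the end. AirLLM does not expose such an interface in this issue's discussion; the sketch below is purely illustrative, showing the generator-style API shape being asked for. The names `stream_tokens` and `toy_decode` are hypothetical stand-ins, not AirLLM functions.

```python
from typing import Callable, Iterator, List, Optional

def stream_tokens(
    decode_step: Callable[[str, List[str]], Optional[str]],
    prompt: str,
) -> Iterator[str]:
    """Yield tokens one at a time as they are produced.

    `decode_step` is a hypothetical stand-in for a model's incremental
    decode: given the prompt and the tokens generated so far, it returns
    the next token, or None when generation is finished.
    """
    generated: List[str] = []
    while True:
        token = decode_step(prompt, generated)
        if token is None:
            break
        generated.append(token)
        yield token  # the caller can print/flush this immediately

# Toy "model" that emits a fixed reply token by token, so the
# streaming loop above can be exercised without any real weights.
def toy_decode(prompt: str, so_far: List[str]) -> Optional[str]:
    reply = ["Hello", ",", " world", "!"]
    return reply[len(so_far)] if len(so_far) < len(reply) else None

if __name__ == "__main__":
    for tok in stream_tokens(toy_decode, "Hi"):
        print(tok, end="", flush=True)  # prints "Hello, world!" incrementally
    print()
```

A real implementation would replace `toy_decode` with the model's layer-by-layer forward pass; the caller-facing generator interface stays the same.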