scaleapi / llm-engine

Scale LLM Engine public repository

Home Page: https://llm-engine.scale.com

Further reduction of pod cold start time

yunfeng-scale opened this issue

I believe we could get pod cold start below 20s for Llama 2 7B models.