octoml / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Home Page: https://mlc.ai/mlc-llm


Remaining pieces for upstreaming

sunggg opened this issue

These are the prerequisites for making mlc-serve an independent package:

  • Mixtral support @vinx13
  • vLLM v2 kernel @vinx13
  • Misc changes in core.py for mlc-serve-specific artifact dump @sunggg
  • Batched-model support for the split + rotary fusion pass (mlc_llm/transform/fuse_split_rotary_embedding.py). This one currently depends on a hack to TVM.
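
For context on the last item: the fusion pass combines the split of the fused QKV projection with the application of rotary position embedding (RoPE), which rotates each even/odd pair of channels by a position-dependent angle. The sketch below is an illustrative NumPy reference of the RoPE computation itself, not the actual TVM transform; the function name and the standard base of 10000 are assumptions for illustration.

```python
import numpy as np

def rotary_embedding(x, base=10000.0):
    # x: (seq_len, head_dim) slice of queries or keys; head_dim must be even.
    seq_len, head_dim = x.shape
    # One rotation frequency per (even, odd) channel pair.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    # Rotation angle for each (position, frequency) pair.
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, head_dim // 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # 2-D rotation applied pairwise to the channels.
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because the rotation at position 0 has angle zero, the first row passes through unchanged, and every rotation preserves the per-token norm, which is a quick sanity check for a fused implementation.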