efeslab / fiddler

Fast Inference of MoE Models with CPU-GPU Orchestration

Home Page: https://arxiv.org/abs/2402.07033
