A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.