Lightning-Universe / stable-diffusion-deploy

Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services working together via the Lightning Apps framework.

Home Page: https://lightning.ai/muse

improve auto batching logic

aniketmaurya opened this issue

Right now, automatic batching is done in the LoadBalancer, which is sequential.

We can move it to the model server level for more concurrency.
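A rough sketch of what server-level batching could look like, not the repo's actual code. `ModelServerBatcher`, `predict_batch`, `max_batch_size`, and `timeout` are hypothetical names for illustration. The idea is that each model server owns its own queue and batching loop, so several servers can collect and run batches at the same time instead of waiting on one sequential loop in the LoadBalancer.

```python
import asyncio


class ModelServerBatcher:
    """Hypothetical per-server batcher (illustration only, not the repo's API)."""

    def __init__(self, predict_batch, max_batch_size=4, timeout=0.05):
        self.predict_batch = predict_batch    # assumed: list of prompts -> list of results
        self.max_batch_size = max_batch_size  # assumed upper bound per GPU call
        self.timeout = timeout                # max seconds to wait while filling a batch
        self.queue = asyncio.Queue()

    async def submit(self, prompt):
        # Each request gets a future that is resolved once its batch has run.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, future))
        return await future

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives, then start a batch.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.timeout
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            # One call for the whole batch. A real server would likely run this
            # in an executor so the event loop is not blocked during inference.
            results = self.predict_batch([prompt for prompt, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)


async def demo():
    batch_sizes = []

    def fake_predict_batch(prompts):
        # Stand-in for the real model call; records how large each batch was.
        batch_sizes.append(len(prompts))
        return [f"image for {p}" for p in prompts]

    batcher = ModelServerBatcher(fake_predict_batch, max_batch_size=4, timeout=0.05)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(f"prompt {i}") for i in range(6)))
    worker.cancel()
    return results, batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this layout the LoadBalancer would only route requests, and each server instance would run its own `run()` loop. The batch size and timeout values here are placeholders and would need tuning against real GPU latency.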

@ethanwharris and @aniketmaurya are checking this