Lightning-Universe / stable-diffusion-deploy

Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load balancing, orchestration, pre-provisioning, dynamic batching, GPU inference, and microservices working together via the Lightning Apps framework.

https://lightning.ai/muse

improve auto batching logic

aniketmaurya opened this issue 2 years ago

Aniket Maurya commented 2 years ago

Right now, automatic batching is done in the LoadBalancer, which is sequential.

We can move it to the model server level for more concurrency.
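A rough sketch of what server-side batching could look like, to make the idea concrete. This is not code from this repo and does not use the Lightning API; it is a plain `asyncio` illustration, and `ServerSideBatcher`, `predict_batch`, `max_batch_size`, and `timeout` are hypothetical names chosen for the example. Each model server would own its own queue and batching loop, so several servers could build batches concurrently instead of waiting on one sequential loop in the load balancer.

```python
import asyncio
from typing import Any, Callable, List


class ServerSideBatcher:
    """Hypothetical per-model-server batcher (illustrative, not the repo's code)."""

    def __init__(self, predict_batch: Callable[[List[Any]], List[Any]],
                 max_batch_size: int = 4, timeout: float = 0.05) -> None:
        self.predict_batch = predict_batch    # assumed: takes a list, returns a list
        self.max_batch_size = max_batch_size  # assumed tuning knob
        self.timeout = timeout                # max seconds to wait for a fuller batch
        self.queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, prompt: Any) -> Any:
        """Enqueue one request and wait for its individual result."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def run(self) -> None:
        """Collect requests into batches and run each batch in a worker thread."""
        loop = asyncio.get_running_loop()
        while True:
            items = [await self.queue.get()]  # block until the first request arrives
            deadline = loop.time() + self.timeout
            while len(items) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            prompts = [prompt for prompt, _ in items]
            try:
                # Run off the event loop so new requests can still be queued.
                results = await loop.run_in_executor(None, self.predict_batch, prompts)
            except Exception as exc:
                for _, fut in items:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), result in zip(items, results):
                if not fut.done():
                    fut.set_result(result)


async def demo():
    batch_sizes = []

    def fake_predict_batch(prompts):
        # Stand-in for the real model call.
        batch_sizes.append(len(prompts))
        return [f"image-for:{p}" for p in prompts]

    batcher = ServerSideBatcher(fake_predict_batch, max_batch_size=4, timeout=0.05)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(f"p{i}") for i in range(6)))
    worker.cancel()
    return results, batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this shape, the load balancer would only need to route single requests to a server, and each server decides on its own when a batch is full or the timeout has expired.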

Aniket Maurya commented 2 years ago

@ethanwharris and @aniketmaurya are checking this