Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.

What is the minimum VRAM requirement for training and inference?

olliacc opened this issue · comments

What is the minimum amount of GPU video memory (VRAM) needed to run Latte video generation effectively, for both training and inference?

Hi, thanks for your interest. Inference of one video on an A100 requires 20916 MiB of GPU memory under fp16 precision. As for the GPU memory required for training, it depends on your batch size.
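For a quick sanity check against that figure, a minimal sketch (pure Python; the function name and the card sizes are illustrative, only the 20916 MiB number comes from the reply above) of whether a given card's VRAM fits:

```python
def fits_in_vram(required_mib: int, card_gib: float) -> bool:
    """Check whether a card with `card_gib` GiB of VRAM can hold
    `required_mib` MiB (the unit nvidia-smi usually reports)."""
    card_mib = card_gib * 1024  # 1 GiB = 1024 MiB
    return card_mib >= required_mib

LATTE_FP16_INFERENCE_MIB = 20916  # figure quoted above for one video on an A100

print(fits_in_vram(LATTE_FP16_INFERENCE_MIB, 24))  # 24 GiB card -> True
print(fits_in_vram(LATTE_FP16_INFERENCE_MIB, 16))  # 16 GiB card -> False
```

So a 24 GiB consumer card would fit the quoted fp16 inference footprint, while a 16 GiB card would not, leaving aside any extra overhead.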

@maxin-cn
May I set the local batch size to 1 for training Latte on my own dataset? I've heard that a sufficiently large batch size seems to be key when training diffusion models.

Hi, you can set the batch size to 1, but I'm not sure whether this will hurt performance. You can try it first. Looking forward to your feedback later~
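If memory forces a per-GPU batch size of 1, gradient accumulation can recover a larger effective batch by summing gradients over several micro-batches before each optimizer step. A dependency-free sketch (the 1-D linear model and all names here are toy illustrations of the pattern, not Latte's actual training loop):

```python
def grad_step(w, samples, lr=0.1, accum_steps=4):
    """One pass of SGD for y = w*x with squared loss, accumulating
    gradients over `accum_steps` micro-batches of size 1.  Each update
    is equivalent to a single step with batch size `accum_steps`."""
    grad = 0.0
    for i, (x, y) in enumerate(samples, start=1):
        # d/dw (w*x - y)^2 = 2*(w*x - y)*x, averaged over the effective batch
        grad += 2 * (w * x - y) * x / accum_steps
        if i % accum_steps == 0:
            w -= lr * grad  # step only after a full "effective batch"
            grad = 0.0
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
w = grad_step(0.0, data, lr=0.01, accum_steps=4)
print(w)  # 0.3 — identical to one step with batch size 4
```

In a PyTorch loop the same idea is usually expressed by dividing the loss by the accumulation count and calling the optimizer step only every `accum_steps` iterations.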