Zasder3 / train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

what's the meaning of minibatch_size?

firestonelib opened this issue · comments

Thank you for your CLIP training code! That's great!

Training with the code from your new commit 8d454de, I get the following error:

```
RuntimeError: The expanded size of the tensor (0) must match the existing size (8) at non-singleton dimension 0. Target sizes: [0, 1024]. Tensor sizes: [8, 1024]
```

```
images_tmp[self.global_rank][j*self.minibatch_size:(j+1)*self.minibatch_size] = F.normalize(self.model.encode_image(mb), dim=1)
```

At this point, minibatch_size = 0.
Would you please explain the meaning of minibatch_size? How do I use minibatch_size?

Really good question, I'll make sure to update the README to include a description of all flags. The minibatch_size flag is used to chunk the larger batch_size into more manageable pieces for memory. For example, say you want to hit a batch size of 1,024, but that won't fit on your GPU. You can split it into smaller chunks containing only 16 images each with the flag --minibatch_size 16 (see the sketch below).
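As a rough illustration of what the flag does (a minimal sketch, not the repository's actual code; iter_minibatches is a hypothetical helper):

```python
import torch

def iter_minibatches(batch, minibatch_size):
    """Yield successive minibatch_size-sized chunks of a batch tensor.

    Hypothetical helper for illustration; the repo's actual
    implementation lives inside its LightningModule.
    """
    for j in range(0, batch.size(0), minibatch_size):
        yield batch[j:j + minibatch_size]

# A batch of 64 images split into chunks of 16 (scale up to 1,024 in practice).
images = torch.randn(64, 3, 224, 224)
for mb in iter_minibatches(images, minibatch_size=16):
    assert mb.size(0) == 16  # each chunk fits on the GPU on its own
```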

This may be renamed to micro-batching in future versions for the sake of correctness.

During testing I realized that there was faulty handling of minibatch_size; I'm updating it to fix the default assignment!
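For context, a minibatch_size of 0 makes every slice like images_tmp[...][j*0:(j+1)*0] empty, which is exactly the expand error reported above. A guard along these lines would fix the default (a guess at the shape of the fix, not the actual commit):

```python
def resolve_minibatch_size(minibatch_size: int, batch_size: int) -> int:
    # Hypothetical guard, for illustration only: a minibatch_size of 0
    # (the faulty default) falls back to the full batch_size, so the
    # chunked slices can no longer come out empty.
    return minibatch_size if minibatch_size > 0 else batch_size

assert resolve_minibatch_size(0, 8) == 8
assert resolve_minibatch_size(16, 1024) == 16
```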

Hello, this conversation is insightful. Thanks.

Why do we need this distinction between minibatch_size and batch_size? If the batch_size does not fit on the GPU for a training step, why don't we simply make batch_size smaller and set minibatch_size == batch_size?
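One way to see why the distinction matters (inferred from the quoted snippet, which gathers normalized embeddings into a full-batch buffer, not an authoritative answer from the maintainer): CLIP's contrastive loss contrasts every image with every caption in the batch, so shrinking batch_size also shrinks the pool of negatives and changes the loss, while minibatch_size only controls how many items are encoded at once. A minimal sketch, assuming stand-in encode_image/encode_text callables:

```python
import torch
import torch.nn.functional as F

def full_batch_logits(encode_image, encode_text, images, texts, minibatch_size):
    """Encode in minibatch_size chunks, then score the FULL batch.

    Sketch only: a real training step also needs a gradient-enabled
    re-encoding pass per chunk so the loss can backpropagate.
    """
    img_emb, txt_emb = [], []
    with torch.no_grad():  # chunked passes keep peak memory small
        for j in range(0, images.size(0), minibatch_size):
            img_emb.append(F.normalize(encode_image(images[j:j + minibatch_size]), dim=1))
            txt_emb.append(F.normalize(encode_text(texts[j:j + minibatch_size]), dim=1))
    img_emb, txt_emb = torch.cat(img_emb), torch.cat(txt_emb)
    # Every image is contrasted against all batch_size captions, so the
    # number of negatives depends on batch_size, not minibatch_size.
    return img_emb @ txt_emb.t()

# Toy usage with linear "encoders": 64 negatives per image even though
# only 16 items were ever encoded at once.
enc_i, enc_t = torch.nn.Linear(32, 8), torch.nn.Linear(32, 8)
logits = full_batch_logits(enc_i, enc_t, torch.randn(64, 32), torch.randn(64, 32), 16)
assert logits.shape == (64, 64)
```

If batch_size were simply reduced instead, the loss itself would see fewer negatives, which is presumably why the two knobs are kept separate.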