WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Problem when running training with batch size 32

Artem-N opened this issue

Hello, I have the following problem. When training the YOLOv7 model (a dataset of 10k images with 90k instances) with a batch size of 16 and 8 dataloader workers, training goes well and one epoch takes about 1-2 minutes. When I increase the batch size to 32 with 8 or 6 dataloader workers, the first two epochs take 2-3 minutes each, but on the 3rd epoch training stalls and one epoch takes about 20 minutes. After that it may recover, with an epoch taking 2-3 minutes again, or it may keep stalling at about 20 minutes per epoch.

When I train YOLOv8 or YOLOv9 for 32 epochs, this problem does not occur.

Hardware: Intel i9 CPU, 64 GB RAM, NVIDIA GeForce RTX 4090 GPU (24 GB).

I don't know what is happening.

And when I run the same training with a batch size of 30, everything goes well.
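
To narrow down where the extra time goes in the slow epochs, one option is to time the wait for each batch separately from the training step. The following is only a minimal sketch, not part of the yolov7 code: `time_epoch`, `batches`, and `step` are hypothetical names, and `step` stands in for whatever forward/backward/optimizer call the training loop makes.

```python
import time
from typing import Callable, Iterable, Tuple


def time_epoch(batches: Iterable, step: Callable[[object], None]) -> Tuple[float, float]:
    """Return (seconds spent waiting for batches, seconds spent inside step).

    Hypothetical helper for diagnosis only. `batches` is any iterable
    (for example a data loader) and `step` is a callable that processes
    one batch.
    """
    wait_s = 0.0
    step_s = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time blocked on the data pipeline
        except StopIteration:
            break
        t1 = time.perf_counter()
        # On a GPU, kernels run asynchronously, so `step` should end with a
        # synchronization call (e.g. torch.cuda.synchronize()) for this
        # number to be meaningful.
        step(batch)
        t2 = time.perf_counter()
        wait_s += t1 - t0
        step_s += t2 - t1
    return wait_s, step_s
```

If the waiting time grows sharply in the 20-minute epochs, the data loading side is the place to look. If the step time grows instead, the GPU side (for example, memory close to the 24 GB limit at batch size 32) is the more likely suspect.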