Optimizer zero grad performed after ema update
NanoCode012 opened this issue · comments
Lines 313 to 316 in 96fa40a
Hello, may I ask why you perform `optimizer.zero_grad()` after `ema.update(model)`?
Also, is there a reason to call `torch.cuda.synchronize()` in `optimize`? I read that DDP performs synchronization on its own when needed.
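For context, a minimal sketch of the ordering in question (the `ModelEMA` class here is a hypothetical stand-in, not the repo's exact implementation). The key point it illustrates: the EMA update reads model *parameters*, not gradients, so calling `optimizer.zero_grad()` before or after `ema.update(model)` is equivalent:

```python
import copy
import torch
import torch.nn as nn

class ModelEMA:
    """Hypothetical minimal EMA: keeps an exponential moving
    average of the model's parameters."""
    def __init__(self, model, decay=0.999):
        self.ema = copy.deepcopy(model).eval()
        self.decay = decay
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # Reads parameter values only; gradients are never touched.
        for ema_p, p in zip(self.ema.parameters(), model.parameters()):
            ema_p.mul_(self.decay).add_(p.detach(), alpha=1 - self.decay)

model = nn.Linear(4, 2)
ema = ModelEMA(model)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 2)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()                        # apply gradients to the weights
ema.update(model)                 # EMA snapshot of the new weights
opt.zero_grad(set_to_none=True)   # clearing grads here or before
                                  # ema.update() makes no difference
```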
These actually don't make sense. I just added them to keep the code similar to others. You can safely keep the original way.