kazuto1011 / deeplab-pytorch

PyTorch re-implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC datasets

iter_loss

wuzuowuyou opened this issue

First, thank you for your meticulous work!

iter_loss = 0
for logit in logits:
    # Resize labels to match each of the {100%, 75%, 50%, max} logits
    _, _, H, W = logit.shape
    labels_ = resize_labels(labels, size=(H, W))
    iter_loss += criterion(logit, labels_.to(device))

# Propagate backward (just compute gradients)
iter_loss /= CONFIG.SOLVER.ITER_SIZE
iter_loss.backward()
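
For context, this snippet sits inside the gradient-accumulation loop of the training script, roughly like the sketch below (paraphrased; names such as loader_iter are illustrative and may differ from the actual main.py):

optimizer.zero_grad()
for _ in range(CONFIG.SOLVER.ITER_SIZE):
    images, labels = next(loader_iter)        # one sub-batch
    logits = model(images.to(device))         # multi-scale logits
    iter_loss = 0
    for logit in logits:
        _, _, H, W = logit.shape
        labels_ = resize_labels(labels, size=(H, W))
        iter_loss += criterion(logit, labels_.to(device))
    iter_loss /= CONFIG.SOLVER.ITER_SIZE      # scale each loss by 1/N
    iter_loss.backward()                      # accumulate into .grad
optimizer.step()                              # one update per N sub-batches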

Why is it iter_loss /= CONFIG.SOLVER.ITER_SIZE

instead of iter_loss /= len(logits) (the number of multi-scale logits)?

The line is not to average the multiple logits but is to make the accumulated gradients invariant to the number of iteration ITER_SIZE. The block accumulates 1/N-scaled gradients by N times and then update the parameters with N/N-magnitude gradients. It is equivalent to compute the raw gradients once and update parameters immediately. The common trick to save memory.