L1 loss or SmoothL1Loss?
jonathan016 opened this issue
Hi, I've been reading through the code and found that L1 loss is used instead of Smooth L1 loss for the localization loss. This differs from the paper's procedure, since SSD, as far as I know, uses Smooth L1 loss.
https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/model.py#L549
self.smooth_l1 = nn.L1Loss()
https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/model.py#L612
loc_loss = self.smooth_l1(predicted_locs[positive_priors], true_locs[positive_priors]) # (), scalar
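For reference, the two PyTorch losses only differ in how they treat small residuals. Here is a minimal sketch of that difference (the offset values below are made up purely for illustration, and `SmoothL1Loss` is used with its defaults):

```python
import torch
import torch.nn as nn

# Made-up predicted/target box offsets, purely for illustration
predicted_locs = torch.tensor([[0.2, -0.4, 1.5, 3.0]])
true_locs = torch.zeros(1, 4)

l1 = nn.L1Loss()               # per element: |x|, then averaged
smooth_l1 = nn.SmoothL1Loss()  # per element: 0.5 * x^2 if |x| < 1, else |x| - 0.5, then averaged

print(l1(predicted_locs, true_locs))         # tensor(1.2750)
print(smooth_l1(predicted_locs, true_locs))  # tensor(0.9000)
```

The gradients only differ for errors smaller than 1, which may be why the two losses behave similarly in practice.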
My questions are:
- Has anyone tried changing the loss function to `SmoothL1Loss` as currently implemented in PyTorch?
- If it has been tried, is the result similar to what SSD achieves?
Thank you in advance.
@jonathan016 did you try it yet? I'll try it and post the results here once I have some.
Hi @adityag6994, I didn't try it due to my research's limited time and resources. However, using `L1Loss` still seemed to help the learning process in my experiments. I look forward to seeing the results you obtain with `SmoothL1Loss`.
Okay, I started the experiment and will have something by tomorrow. Same here, `L1Loss` worked for me on the dataset I tried it on.
Side question: I just noticed there is no softmax being applied when calculating the cross-entropy loss (https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/model.py#L629). Do you have any idea why, or am I missing something here?
Thanks,
Aditya
@jonathan016 so I changed https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection/blob/master/model.py#L549
self.smooth_l1 = nn.L1Loss()
to this
self.smooth_l1 = nn.SmoothL1Loss()
and there was less than a 1% drop in mean Average Precision (mAP) for me.
Wow, that's interesting @adityag6994, thanks for the experiment! At least now we know which works best for your case, since different datasets may require different approaches 😁
Anyway, softmax is applied implicitly inside `CrossEntropyLoss`, since `CrossEntropyLoss` is actually `LogSoftmax` followed by `NLLLoss`, if I'm not mistaken (see https://pytorch.org/docs/master/nn.html#torch.nn.CrossEntropyLoss). That said, at inference time you need to explicitly apply the softmax function to the model's output if you're after probability values.
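A quick sketch of that equivalence and of applying softmax at inference (random logits, and the 21-class shape below is just an example I picked):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 21)           # raw, un-normalized class scores for 4 priors over 21 classes
targets = torch.randint(0, 21, (4,))  # ground-truth class indices

# CrossEntropyLoss == LogSoftmax followed by NLLLoss
ce = F.cross_entropy(logits, targets)
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(ce, nll))  # True

# At inference, apply softmax explicitly to turn the scores into probabilities
probs = F.softmax(logits, dim=1)
print(probs.sum(dim=1))  # each row sums to 1
```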
That makes sense now. Thank you @jonathan016
With the experiment results provided by @adityag6994, I believe my questions have been answered. Closing this issue for now. Thanks a lot @adityag6994!