BryanPlummer / two_branch_networks

PyTorch implementation of "Learning Deep Structure-Preserving Image-Text Embeddings"

About the loss function and the textual feature

JCZ404 opened this issue

Hi, thanks for your great work! I have two questions about your code. First, the loss function: the original paper says it uses a logistic regression loss, but your code seems to compute only a positive-pair and negative-pair loss between the sentence and the image. Second, the task: the original paper focuses on phrase grounding, but your code does not seem to handle the individual phrases in a caption and instead treats the caption as a whole. Which task does your code focus on? Could you explain this a little?
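For reference, here is a minimal sketch of what a logistic loss over image-sentence pairs could look like. It is not taken from the repository or the paper. It assumes each pair has a real-valued similarity `score` and a `label` of +1 (matching) or -1 (non-matching), and the function names `logistic_pair_loss` and `mean_logistic_loss` are made up for illustration. It uses only the Python standard library so that it runs without PyTorch.

```python
import math

def logistic_pair_loss(score, label):
    """Logistic loss log(1 + exp(-label * score)) for one image-text pair.

    score: similarity of the pair (any real number); an assumption here.
    label: +1 for a matching pair, -1 for a non-matching pair.
    """
    z = label * score
    # Numerically stable form:
    # log(1 + exp(-z)) = max(-z, 0) + log1p(exp(-|z|)),
    # so exp() is never called on a large positive argument.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))

def mean_logistic_loss(scores, labels):
    """Average the per-pair logistic loss over a batch of pairs."""
    losses = [logistic_pair_loss(s, y) for s, y in zip(scores, labels)]
    return sum(losses) / len(losses)
```

With this form, a matching pair with a high score gets a loss near zero, while the same score on a non-matching pair is penalized roughly linearly in the score.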

Hi. Thank you for the excellent work. Could I ask a question about your loss function? Although the code runs without errors, the loss value is always NaN. Is there something wrong?
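One way to narrow down a report like this is to find the first training step at which the loss stops being finite. The sketch below is only an illustration, not part of the repository. It assumes the loss has been recorded as a plain Python float at every step (for example with `loss.item()` in PyTorch), and `first_non_finite` is a hypothetical helper name.

```python
import math

def first_non_finite(values):
    """Return the index of the first NaN or infinite value, or None."""
    for step, value in enumerate(values):
        if not math.isfinite(value):
            return step
    return None

# Hypothetical recorded losses; the third entry is NaN.
recorded = [0.71, 0.64, float("nan"), 0.58]
bad_step = first_non_finite(recorded)
```

If the returned index is 0, the loss was never finite, which points at the inputs or the initialization rather than at divergence during training.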