yalesong / pvse

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)

Not sure how MIL loss is computed

chinmay5 opened this issue

First up, thank you so much for the wonderful code. There is still one thing I am not able to understand from it: the code uses an MIL loss, but I am not sure how that loss is implemented with the MaxPool idea. This may be partly due to my lack of understanding of MIL itself. If there is any resource you think could help me here, I would be really grateful.

Hi, thanks for the kind words! The MaxPool2d is there to pick out the maximum similarity within each KxK block of embedding pairs, i.e. the best-matching pair among the K embeddings on each side. Note that in the paper we write it as the minimum over distances, which is the same as the maximum over similarities.
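To make the pooling step concrete, here is a minimal sketch of the idea, not the repository's actual code. It assumes each image and each sentence comes with K embeddings of dimension D, that the embeddings are L2-normalized so dot products are cosine similarities, and the function name `mil_similarity` and the tensor shapes are made up for illustration.

```python
import torch
import torch.nn.functional as F


def mil_similarity(img_emb: torch.Tensor, txt_emb: torch.Tensor) -> torch.Tensor:
    """Return an (N, M) matrix holding the best-pair cosine similarity.

    img_emb: (N, K, D) tensor, K embeddings per image (assumed layout).
    txt_emb: (M, K, D) tensor, K embeddings per sentence (assumed layout).
    """
    n, k, d = img_emb.shape
    m = txt_emb.shape[0]

    # L2-normalize so that dot products are cosine similarities (assumption).
    img = F.normalize(img_emb.reshape(n * k, d), dim=-1)
    txt = F.normalize(txt_emb.reshape(m * k, d), dim=-1)

    # (N*K, M*K) similarity matrix: entry [i*K + a, j*K + b] compares
    # embedding a of image i with embedding b of sentence j.
    sim = img @ txt.t()

    # Non-overlapping KxK windows: each window covers all K*K embedding
    # pairs of one (image, sentence) pair, and max-pooling keeps the best one.
    pooled = F.max_pool2d(sim[None, None], kernel_size=k, stride=k)
    return pooled[0, 0]  # (N, M)


torch.manual_seed(0)
N, M, K, D = 4, 3, 2, 8
img_emb = torch.randn(N, K, D)
txt_emb = torch.randn(M, K, D)
scores = mil_similarity(img_emb, txt_emb)
print(scores.shape)
```

If the distance is taken to be squared Euclidean distance between unit-norm vectors (an assumption, the reply above only says "distances"), then distance = 2 - 2 * cosine similarity, so the pair with the minimum distance is exactly the pair with the maximum similarity.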