Confusion about an equation in the paper
sunshineatnoon opened this issue · comments
SunshineAtNoon commented
Thanks for making the code public. This is a great work!
This issue is not about the code, but I feel a little confused about the 11th equation in the paper, since the relevant code is not available.
What does i indicates in the above equation? and does t refer to an index of an image fragment or a sentence fragment? Also maybe maximizing this term makes more sense? I would really appreciate it if you can point out my misunderstandings here. Thanks!