A model that takes pre-trained image and text features and outputs a similarity score between them. Implemented in PyTorch and trained on the Flickr30K dataset.
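The description above does not specify the architecture, so the following is only a minimal sketch of one common way such a model can be built: each modality is projected into a shared embedding space and scored by cosine similarity. The class name `SimilarityModel`, the feature dimensions (2048 for images, 768 for text), and the embedding size are all hypothetical and are not taken from this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimilarityModel(nn.Module):
    """Hypothetical two-branch model that maps both modalities to a shared space."""

    def __init__(self, image_dim=2048, text_dim=768, embed_dim=256):
        super().__init__()
        # All dimensions are illustrative assumptions, not the repository's values.
        self.image_proj = nn.Linear(image_dim, embed_dim)
        self.text_proj = nn.Linear(text_dim, embed_dim)

    def forward(self, image_feats, text_feats):
        # L2-normalise so that the dot product equals cosine similarity.
        img = F.normalize(self.image_proj(image_feats), dim=-1)
        txt = F.normalize(self.text_proj(text_feats), dim=-1)
        # Entry (i, j) scores image i against caption j.
        return img @ txt.t()


torch.manual_seed(0)
model = SimilarityModel()
image_feats = torch.randn(4, 2048)  # stand-in for pre-extracted image features
text_feats = torch.randn(4, 768)    # stand-in for pre-extracted caption features
scores = model(image_feats, text_feats)
print(scores.shape)
```

With a batch of 4 images and 4 captions, `scores` is a 4×4 matrix whose entries lie in [-1, 1]. In a retrieval setup like Flickr30K, the diagonal would typically hold the matching image-caption pairs, though the training objective used here is not stated in the source.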