Marcella Cornia's starred repositories
Awesome-Multimodal-Large-Language-Models
Latest Papers and Datasets on Multimodal Large Language Models and Their Evaluation.
ActivityNet-Entities
A Dataset for Grounded Video Description
MLNet-Pytorch
PyTorch implementation of "A Deep Multi-Level Network for Saliency Prediction"
awesome-human-visual-attention
A curated list of research papers and resources on saliency and scanpath prediction, human attention, and human visual search.
DynamicConv-agent
PyTorch code for the BMVC 2019 paper: "Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters"
perceive-transform-and-act
PyTorch code for the paper: "Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation"