Marcella Cornia (marcellacornia)

Company: AImageLab, University of Modena and Reggio Emilia

Location: Modena, Italy

Organizations
aimagelab

Marcella Cornia's starred repositories

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | Stargazers: 22963 | Issues: 315 | Issues: 385

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9018 | Issues: 95 | Issues: 617
Language: Python | License: Apache-2.0 | Stargazers: 952 | Issues: 19 | Issues: 96

ActivityNet-Entities

A Dataset for Grounded Video Description

Language: Python | License: NOASSERTION | Stargazers: 157 | Issues: 18 | Issues: 9

pacscore

Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023

MLNet-Pytorch

Implementation of "A Deep Multi-Level Network for Saliency Prediction" in PyTorch

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 30 | Issues: 4 | Issues: 4

awesome-human-visual-attention

This repository contains a curated list of research papers and resources on saliency and scanpath prediction, human attention, and human visual search.

Stargazers: 21 | Issues: 0 | Issues: 0

DynamicConv-agent

PyTorch code for the BMVC 2019 paper: "Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters"

Language: C++ | License: MIT | Stargazers: 21 | Issues: 4 | Issues: 0

perceive-transform-and-act

PyTorch code for the paper: "Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation"

Language: C++ | License: MIT | Stargazers: 19 | Issues: 4 | Issues: 1

MaPeT

Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training

Language: Python | License: MIT | Stargazers: 13 | Issues: 5 | Issues: 1