There are 2 repositories under the cross-modal-learning topic.
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)
CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
[ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Projection Learning model and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on image-text retrieval on a fashion clothing dataset.
[IJBHI 2023] Official implementation of "CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation", accepted to the IEEE Journal of Biomedical and Health Informatics (J-BHI), 2023.
Original PyTorch implementation of the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data", presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
Code for Limbacher, T., Özdenizci, O., & Legenstein, R. (2022). Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity. arXiv preprint arXiv:2205.11276.
This project creates the T4SA 2.0 dataset, a large dataset for training visual models for sentiment analysis in the Twitter domain using a cross-modal student-teacher approach.
An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.
This is the code for our ICCV'19 paper on cross-modal learning and retrieval.
[ECCV 2024] Official Implementation of "TrajPrompt: Aligning Color Trajectory with Vision-Language Representations"
We design a cross-modal GAN that learns image-to-image modality translation across domains. The network synthesizes infrared images from visible images on the VEDAI dataset.