SoyeonHH / MMML

Studies of MultiModal Machine Learning


Courses

Papers

| Index | Category | Model | Paper |
| --- | --- | --- | --- |
| 1 | Survey | Multimodal Video Sentiment Analysis | Multimodal Video Sentiment Analysis Using Deep Learning Approaches, a Survey |
| 2 | Fusion | Bi-GRU | Video multimodal emotion recognition based on Bi-GRU and attention fusion |
| 3 | Fusion | TFN | Tensor Fusion Network for Multimodal Sentiment Analysis |
| 4 | Fusion | MFN | Memory Fusion Network for Multi-view Sequential Learning |
| 5 | Pretraining | MAG-BERT, MAG-XLNet | Integrating Multimodal Information in Large Pretrained Transformers |
| 6 | Pretraining | BERT | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding |
| 7 | Pretraining | VideoBERT | VideoBERT: A Joint Model for Video and Language Representation Learning |
| 8 | Survey | Transformer-based Video-Language Pre-Training | Survey: Transformer based Video-Language Pre-training |
| 9 | Pretraining | UniVL | UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation |
| 10 | Pretraining | VATT | VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text |
| 11 | Pretraining | Audio DistilBERT | Audio DistilBERT: A Distilled Audio BERT for Speech Representation Learning |
| 12 | Dataset | CMU-MOSI | MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos |
| 13 | Dataset | HowTo100M | HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips |
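Of the fusion papers above, TFN (entry 3) has a particularly compact core idea: append a constant 1 to each modality embedding and take the outer product across modalities, so the fused tensor contains unimodal, bimodal, and trimodal interaction terms at once. A minimal NumPy sketch of that fusion step, with toy embedding sizes chosen for illustration (not taken from the paper's code):

```python
import numpy as np

def tensor_fusion(z_text, z_audio, z_video):
    """TFN-style fusion: 3-way outer product of modality embeddings.

    Appending 1.0 to each embedding before the outer product means the
    result contains the original unimodal features, all pairwise
    (bimodal) products, and the full trimodal products.
    """
    zt = np.append(z_text, 1.0)
    za = np.append(z_audio, 1.0)
    zv = np.append(z_video, 1.0)
    fused = np.einsum('i,j,k->ijk', zt, za, zv)
    return fused.ravel()  # flatten for a downstream classifier head

# toy embeddings: text dim 3, audio dim 2, video dim 4
f = tensor_fusion(np.ones(3), np.ones(2), np.ones(4))
print(f.shape)  # (4 * 3 * 5,) = (60,)
```

The flattened vector grows multiplicatively with the modality dimensions, which is why follow-up work (e.g. low-rank fusion variants) factorizes this tensor.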

References

| Index | Category | Contents |
| --- | --- | --- |
| 1 | Contrastive Learning (CL) | Understanding Contrastive Learning, Extending Contrastive Learning to the Supervised Setting |
| 2 | Video datasets | CMU-MultimodalSDK, CMU-MOSI, HowTo100M |
| 3 | Latent Space | Understanding Latent Space in Machine Learning |
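The contrastive-learning references above center on the InfoNCE-style objective: given an anchor embedding, score its positive pair against negatives and minimize cross-entropy with the positive. A minimal NumPy sketch of that loss (function name and temperature value are illustrative, not from any of the referenced posts):

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: cross-entropy with the positive at index 0.

    Similarities are cosine-based and temperature-scaled; a lower loss
    means the anchor is closer to its positive than to the negatives.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    logits = np.array(
        [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    ) / temperature
    logits -= logits.max()  # numerical stability before softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

a = np.array([1.0, 0.0])
aligned = info_nce(a, np.array([1.0, 0.0]), [np.array([0.0, 1.0])])
misaligned = info_nce(a, np.array([0.0, 1.0]), [np.array([1.0, 0.0])])
print(aligned < misaligned)  # True: matching pair yields lower loss
```

The supervised extension in the second reference generalizes this by treating every same-class sample as a positive rather than a single augmented view.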
