This tutorial on Cross-Modal Matching / Pretraining / Transferring is continually updated to provide preliminary insight.
【2024.03.09】 A new section named [Large Multi-Modality Model] has been added.
【2023.05.25】 A new section named [Parameter-Efficient Finetuning] has been added.
【2021.12.11】 A new section named [Video-Text Learning] has been added.
【2021.07.10】 A new section named [Vision-Language Pretraining] has been added.
- Large Multi-Modality Model
- Parameter-Efficient Finetuning
- Vision-Language Pretraining
- Conventional Image-Text Matching
- Generic-Feature Extraction
- Cross-Modal Interaction
- Similarity Measurement
- Uncertainty Learning
- Noisy Correspondence
- Commonsense Learning
- Adversarial Learning
- Loss Function
- Un-/Semi-Supervised
- Zero-/Few-Shot
- Continual Learning
- Identification Learning
- Video-Text Learning
- Scene-Text Learning
- Related Works
- Posted in
Released under the MIT license. If you have any questions, please contact me at r1228240468@gmail.com.