İlker Kesen's repositories
adapter-transformers
Huggingface Transformers + Adapters = ❤️
ilkerkesen
Repository for my bio
Ask-Anything
[VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Awesome-Referring-Image-Segmentation
📚 A collection of papers about Referring Image Segmentation.
caption_metrics
Evaluation Metrics for Image Captioning
colorfromlanguage
Code base for the paper "Learning to Color from Language"
colorization
Automatic colorization using deep neural networks. "Colorful Image Colorization." In ECCV, 2016.
DeepLabV3Plus-Pytorch
DeepLabv3, DeepLabv3+ and pretrained weights on VOC & Cityscapes
frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
ilkerkesen.github.io
Personal Website
MCQ
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
pytorch-deeplab-xception
DeepLab v3+ model in PyTorch. Supports different backbones.
singularity
[ACL 2023] Official PyTorch code for the Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding