Kumara Kahatapitiya's starred repositories
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
StableCascade
Official Code for Stable Cascade
LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Awesome-LLMs-for-Video-Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
visual_anagrams
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
X3D-Multigrid
PyTorch implementation of X3D models with Multigrid training.
Coarse-Fine-Networks
Code for our CVPR 2021 paper "Coarse-Fine Networks for Temporal Activity Detection in Videos"
crossway_diffusion
The official code of our ICRA'24 paper Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
lifelong-memory
Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
LinearConv
Code for our WACV 2021 paper "Exploiting the Redundancy in Convolutional Filters for Parameter Reduction"
open_x_pytorch_dataloader
An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment