π€‘π's starred repositories
contrastors
Train Models Contrastively in PyTorch
LivePortrait
Bring portraits to life!
ego4d-goalstep
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)
common_metrics_on_video_quality
Easily calculate FVD, PSNR, SSIM, and LPIPS to evaluate the quality of generated or predicted videos.
VideoHallucer
VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
tiny-diffusion
A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.
LLaVA-RLHF
Aligning LMMs with Factually Augmented RLHF
videollm-online
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
VoCo-LLaMA
VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
WebDesignAgent
WebDesignAgent: Towards Effortless Website Creation
Recap-DataComp-1B
This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?"
OmniTokenizer
OmniTokenizer: one model and one weight for image-video joint tokenization.