Aniki's repositories
CRIS.pytorch
An official PyTorch implementation of the CRIS paper
Awesome-Cross-Modal-Video-Moment-Retrieval
Continuously updated list of cutting-edge papers on video moment retrieval, temporal language grounding, and video clip retrieval.
awesome-language-model-with-vision
Resources related to vision-and-language models
Awesome-Segment-Anything
A collection of resources about Segment Anything (SAM), including the latest papers and demos
awesome-source-free-test-time-adaptation
[2022] A curated list of papers in Test-time Adaptation, Test-time Training and Source-free Domain Adaptation
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Awesome-Video-Diffusion-Models
[Arxiv] A Survey on Video Diffusion Models
datacomp
DataComp: In search of the next generation of multimodal datasets
langchain
⚡ Building applications with LLMs through composability ⚡
LayoutGPT
Official repo for LayoutGPT
MaskCLIP
Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral)
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
VideoX
VideoX: a collection of video cross-modal models
Gen-L-Video
The official implementation for "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising".
LLM-in-Vision
Recent LLM-based computer vision and related works. Comments and contributions are welcome!
MedSegDiff
Official implementation of paper "MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model"
RPG-DiffusionMaster
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
SciencePlots
Matplotlib styles for scientific plotting
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
viper
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
visual-chatgpt
Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models