Xuweiyi Chen's repositories
AnimateDiff
Official implementation of AnimateDiff.
AnimateDiff-MotionDirector
MotionDirector Training For AnimateDiff. Train a MotionLoRA and run it on any compatible AnimateDiff UI.
ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
ControlNet
Let us control diffusion models!
DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
datasets
🎁 5,400,000+ Unsplash images made available for research and machine learning
deformable-attention
Implementation of Deformable Attention in PyTorch from the paper "Vision Transformer with Deformable Attention"
dream-in-4d
Official PyTorch implementation of "A Unified Approach for Text- and Image-guided 4D Scene Generation" [CVPR 2024]
dust3r
DUSt3R: Geometric 3D Vision Made Easy
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
embodied-generalist
Official code repository for 3D embodied generalist agent LEO
FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
grok-1
Grok open release
GroundingDINO
Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
InternVL
[CVPR 2024] InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks —— An Open-Source Alternative to ViT-22B
jtd-remote
Example of Just the Docs as a remote theme
LLaVA_Attn_Control
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
LLM-groundedDiffusion
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD)
open_flamingo
An open-source framework for training large multimodal models.
StoryDiffusion
Create Magic Story!
transformers_attn_control
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
trl
Train transformer language models with reinforcement learning.
VideoCrafter
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
zero123plus
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.