AndreJJXu's starred repositories

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2653 | Issues: 0

Language: MATLAB | License: GPL-3.0 | Stargazers: 10198 | Issues: 0

Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

Language: Python | License: Apache-2.0 | Stargazers: 136 | Issues: 0

align_sd

Better Aligning Text-to-Image Models with Human Preference. ICCV 2023

Language: Python | License: Apache-2.0 | Stargazers: 258 | Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 1589 | Issues: 0

VGen

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Language: Python | Stargazers: 2798 | Issues: 0

RevIN

RevIN: Reversible Instance Normalization For Accurate Time-series Forecasting Against Distribution Shift

Language: Python | License: MIT | Stargazers: 232 | Issues: 0

ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.

License: MIT | Stargazers: 404 | Issues: 0

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Language: Python | License: NOASSERTION | Stargazers: 2629 | Issues: 0

T2I-CompBench

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Language: Python | License: MIT | Stargazers: 174 | Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 13908 | Issues: 0

cobra

Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

Language: Python | License: MIT | Stargazers: 226 | Issues: 0

Language: Python | Stargazers: 11 | Issues: 0

DG-SCT

NeurIPS'2023 official implementation code

Language: Python | Stargazers: 53 | Issues: 0

LAVISH

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Language: Python | Stargazers: 80 | Issues: 0

audio-dataset

Audio Dataset for training CLAP and other models

Language: Python | Stargazers: 606 | Issues: 0

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1266 | Issues: 0

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | Stargazers: 1612 | Issues: 0

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Language: Python | Stargazers: 457 | Issues: 0

ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

Language: Python | License: Apache-2.0 | Stargazers: 156 | Issues: 0

SceneWiz3D

[CVPR 2024] SceneWiz3D: Towards Text-guided 3D Scene Composition

Stargazers: 91 | Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3108 | Issues: 0

ControlNet-v1-1-nightly

Nightly release of ControlNet 1.1

Language: Python | Stargazers: 4546 | Issues: 0

ControlNet

Let us control diffusion models!

Language: Python | License: Apache-2.0 | Stargazers: 29184 | Issues: 0

License: CC-BY-4.0 | Stargazers: 798 | Issues: 0

blended-latent-diffusion

Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]

Language: Jupyter Notebook | License: MIT | Stargazers: 544 | Issues: 0

SyncDiffusion

Official implementation of SyncDiffusion.

Language: Jupyter Notebook | License: MIT | Stargazers: 142 | Issues: 0

MultiDiffusion

Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)

Language: Jupyter Notebook | Stargazers: 950 | Issues: 0

ModalBiasAVSR

Official implementation of the CVPR 2024 paper "A Study of Dropout-Induced Modality Bias on Robustness to Missing Video".

Stargazers: 8 | Issues: 0

clotho-dataset

Python code for handling the Clotho dataset.

Language: Python | License: NOASSERTION | Stargazers: 74 | Issues: 0