Maitreyapatel

Maitreya Patel's starred repositories

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | Apache-2.0 | 22108 186 490

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | MIT | 11503 154 344

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | MIT | 4198 115 81

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | Apache-2.0 | 2951 30 113

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | MIT | 1307 27 48

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python | Apache-2.0 | 1084 42 47

rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Language: Python | MIT | 820 7 37

sdxs

Official repo of our paper "SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions"

Language: Python | Apache-2.0 | 603 24 18

AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python | MIT | 474 10 49

piecewise-rectified-flow

PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)

Language: Jupyter Notebook | BSD-3-Clause | 437 17 11

awesome-text-to-image-studies

A collection of awesome text-to-image generation studies.

Language: TeX | MIT | 407 120

Ctrl-Adapter

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Language: Python | Apache-2.0 | 388 21 25

MACE

[CVPR 2024] "MACE: Mass Concept Erasure in Diffusion Models" (Official Implementation)

Language: Python | MIT | 351 2 14

awesome-video-generation

A collection of awesome video generation studies.

Language: TeX | MIT | 325 13 1

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | MIT | 313 16 16

hf_transfer

Language: Rust | Apache-2.0 | 310 33 16

PnPInversion

[ICLR 2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"

Language: Jupyter Notebook | 249 6 13

ml-veclip

The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"

Language: Jupyter Notebook | NOASSERTION | 228 150

VTimeLLM

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Language: Python | NOASSERTION | 218 2 37

StyleID

[CVPR 2024 Highlight] Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

Language: Python | MIT | 210 3 22

Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

191 10 1

TokenCompose

(CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision

Language: Jupyter Notebook | Apache-2.0 | 111 3 9

FreeStyle

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models

Language: Python | 109 5 9

RealCompo

[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models

Language: Python | 106 4 4

SpLiCE

Sparse Linear Concept Embeddings

Language: Python | Apache-2.0 | 64 3 4

unified-io-2.pytorch

Language: Python | Apache-2.0 | 62 7 12

Momentor

Language: Python | 50 7 9

edit-one-for-all

✏️ Edit One for All: Interactive Batch Image Editing (CVPR 2024)

Language: Python | 50 4 3

DAC

Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"

Language: Python | NOASSERTION | 25 2 1

WOUAF

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models (CVPR 2024)

Language: Jupyter Notebook | Apache-2.0 | 12 1 2