Maitreyapatel

Maitreya Patel's starred repositories

fiftyone

The open-source tool for building high-quality datasets and computer vision models

Language: Python | License: Apache-2.0 | 8010 55 1494

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | 3937 114 77

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | 2728 30 106

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | 1075 21 31

rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Language: Python | License: MIT | 784 7 33

sdxs

Official repo of our paper "SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions"

Language: Python | License: Apache-2.0 | 579 26 16

AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python | License: MIT | 431 11 46

Ctrl-Adapter

Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Language: Python | License: Apache-2.0 | 364 22 22

minRF

Minimal implementation of scalable rectified flow transformers, based on SD3's approach

Language: Jupyter Notebook | License: Apache-2.0 | 353 6 9

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

347 5 25

FIFO-Diffusion_public

Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training

Language: Python | 324 11 25

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | License: MIT | 297 16 16

MACE

[CVPR 2024] "MACE: Mass Concept Erasure in Diffusion Models" (Official Implementation)

Language: Python | License: MIT | 287 2 11

hf_transfer

Language: Rust | License: Apache-2.0 | 276 33 15

awesome-video-generation

A collection of awesome video generation studies.

Language: TeX | License: MIT | 215 90

VTimeLLM

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Language: Python | License: NOASSERTION | 194 2 30

perception_test

Language: Jupyter Notebook | License: Apache-2.0 | 178 10 24

StyleID

[CVPR 2024 Highlight] Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

Language: Python | License: MIT | 153 3 12

d3po

[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"

Language: Python | License: MIT | 149 7 13

Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

136 90

FouriScale

Official implementation of FouriScale (ECCV2024)

Language: Python | License: Apache-2.0 | 127 11 7

TokenCompose

(CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision

Language: Jupyter Notebook | License: Apache-2.0 | 104 3 9

FreeStyle

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models

Language: Python | 100 5 7

unified-io-2.pytorch

Language: Python | License: Apache-2.0 | 58 6 9

edit-one-for-all

✏️ Edit One for All: Interactive Batch Image Editing (CVPR 2024)

Language: Python | 45 4 3

SpLiCE

Sparse Linear Concept Embeddings

Language: Python | License: Apache-2.0 | 43 3 4

Momentor

Language: Python | 43 7 8

DAC

Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"

Language: Python | License: NOASSERTION | 23 2 1

ID-Preserving-Facial-Aging

Identity-Preserving Aging of Face Images via Latent Diffusion Models [IJCB 2023]

Language: Jupyter Notebook | License: MIT | 17 2 4

WOUAF

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models (CVPR 2024)

Language: Jupyter Notebook | License: Apache-2.0 | 10 10