pkuanjie

University of Rochester

Rochester, NY, US

Jie An's starred repositories

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | 23907 191 3759

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | 23216 249 277

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | 20206 176 353

PyTorch-GAN

PyTorch implementations of Generative Adversarial Networks.

Language: Python | License: MIT | 15988 221 156

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook | License: MIT | 11040 96 336

Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

Language: HTML | License: MIT | 10396 268 43

Tune-A-Video

[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Language: Python | License: Apache-2.0 | 4144 49 94

sd-forge-layerdiffuse

[WIP] Layer Diffusion for WebUI (via Forge)

Language: Python | License: Apache-2.0 | 3642 35 90

LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

Language: Python | License: MIT | 3379 57 100

awesome-tips

Awesome-Visual-Transformer

A collection of papers on transformers for vision. Awesome Transformer with Computer Vision (CV)

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | 2524 460

CogView

Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".

Language: Python | License: Apache-2.0 | 1628 56 63

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | 1560 21 85

PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Language: Python | License: AGPL-3.0 | 1433 38 100

LAMA

LAnguage Model Analysis

Language: Python | License: NOASSERTION | 1323 72 48

video-diffusion-pytorch

Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch

Language: Python | License: MIT | 1185 28 34

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

LibFewShot

LibFewShot: A Comprehensive Library for Few-shot Learning. TPAMI 2023.

Language: Python | License: MIT | 867 25 73

MotionDiffuse

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

Language: Python | License: NOASSERTION | 807 29 33

ClipBERT

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

Language: Python | License: MIT | 693 9 58

xlnet-Pytorch

Simple XLNet implementation with Pytorch Wrapper

Language: Jupyter Notebook | License: Apache-2.0 | 574 15 17

unified-io-2

Language: Python | License: Apache-2.0 | 535 15 16

remi

"Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions", ACM Multimedia 2020

Language: Python | License: GPL-3.0 | 529 14 37

LVDM

LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation

Language: Python | License: MIT | 422 27 21

VQ-Diffusion

Language: Python | License: MIT | 421 6 30

LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF

Language: Python | License: GPL-3.0 | 272 8 30

AlignProp

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.

Language: Python | License: MIT | 203 7 13

ConsistI2V

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)

Language: Python | License: MIT | 173 16 18

d3po

[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"

Language: Python | License: MIT | 132 7 10