valencebond

Jian's starred repositories

RecFormer

Replication of the paper "Text Is All You Need: Learning Language Representations for Sequential Recommendation" at KDD'23.

Language: Python | 70 | 0 | 0

OpenP5

OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems

Language: Python | Apache-2.0 | 196 | 0 | 0

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language: Python | Apache-2.0 | 346 | 0 | 0

Kolors

Kolors Team

Language: Python | Apache-2.0 | 2781 | 0 | 0

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | Apache-2.0 | 8105 | 0 | 0

awesome-video-generation

A collection of awesome video generation studies.

Language: TeX | MIT | 186 | 0 | 0

DiG

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Language: Python | MIT | 100 | 0 | 0

magvit2-pytorch

Implementation of MagViT2 Tokenizer in PyTorch

Language: Python | MIT | 501 | 0 | 0

Vript

Language: Python | NOASSERTION | 94 | 0 | 0

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python | NOASSERTION | 2943 | 0 | 0

pytube

A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.

Language: Python | Unlicense | 11734 | 0 | 0

recognize-anything

Open-source and strong foundation image recognition models.

Language: Jupyter Notebook | Apache-2.0 | 2634 | 0 | 0

LLaVA-NeXT

Language: Python | 1382 | 0 | 0

PLLaVA

Official repository for the paper PLLaVA

Language: Python | 501 | 0 | 0

MultimodalRecSys

A curated list of awesome resources about multimodal recommender systems.

GPL-3.0 | 259 | 0 | 0

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | NOASSERTION | 462 | 0 | 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | Apache-2.0 | 3110 | 0 | 0

Awesome-CVPR2024-ECCV2024-AIGC

A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC

368 | 0 | 0

MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Language: Python | BSD-3-Clause | 487 | 0 | 0

nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Language: Python | Apache-2.0 | 4359 | 0 | 0

k-diffusion

Karras et al. (2022) diffusion models for PyTorch

Language: Python | MIT | 2209 | 0 | 0

fvd-comparison

Comparison between Frechet Video Distance implementation from StyleGAN-V and the original paper

Language: Python | 70 | 0 | 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | Apache-2.0 | 20963 | 0 | 0

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Language: Python | NOASSERTION | 5207 | 0 | 0

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Language: Python | 461 | 0 | 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | MIT | 11039 | 0 | 0

minisora

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Language: Python | Apache-2.0 | 1134 | 0 | 0

LVDM

LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation

Language: Python | MIT | 428 | 0 | 0

LaVie

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Language: Python | Apache-2.0 | 793 | 0 | 0

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | AGPL-3.0 | 2591 | 0 | 0