Chengjiang's starred repositories

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 12461 | Issues: 0

MMLongBench-Doc

Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations

Language: Python | License: Apache-2.0 | Stargazers: 24 | Issues: 0

Stack-Solver

Stack Solver is an app for the optimisation of palletizing and shipping items.

Language: C# | License: GPL-3.0 | Stargazers: 225 | Issues: 0

picx

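🏞️ PicX is an image hosting tool built on the GitHub API. It provides image upload and hosting, image link generation, and a toolbox of common image utilities.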

Language: TypeScript | License: AGPL-3.0 | Stargazers: 4396 | Issues: 0

aseprite

Animated sprite editor & pixel art tool (Windows, macOS, Linux)

Language: C++ | Stargazers: 27779 | Issues: 0

MaskDiT

Code for Fast Training of Diffusion Models with Masked Transformers

Language: Python | License: MIT | Stargazers: 322 | Issues: 0

Language: MATLAB | License: GPL-3.0 | Stargazers: 9950 | Issues: 0

LivePortrait

Bring portraits to life!

Language: Python | License: MIT | Stargazers: 7096 | Issues: 0

VCR

Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 16 | Issues: 0

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language: Python | Stargazers: 519 | Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 1565 | Issues: 0

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | Stargazers: 1390 | Issues: 0

mage

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Language: Python | License: MIT | Stargazers: 489 | Issues: 0

ImprovedNAT

A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"

Language: Python | License: MIT | Stargazers: 18 | Issues: 0

MaskedVectorQuantization

Official PyTorch implementation of the CVPR 2023 paper "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation"

Language: Python | License: MIT | Stargazers: 47 | Issues: 0

Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Language: Python | License: Apache-2.0 | Stargazers: 325 | Issues: 0

audioseal

Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector

Language: Python | License: MIT | Stargazers: 371 | Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python | License: NOASSERTION | Stargazers: 1551 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20247 | Issues: 0

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

Stargazers: 97 | Issues: 0

vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Language: Python | License: MIT | Stargazers: 2195 | Issues: 0

rq-vae-transformer

The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 732 | Issues: 0

KataCR

A non-embedded AI for Clash Royale based on RL and CV.

Language: Python | License: MIT | Stargazers: 171 | Issues: 0

ClashRoyaleBuildABot

A platform for creating bots to play Clash Royale

Language: Python | License: MIT | Stargazers: 191 | Issues: 0

OmniTokenizer

OmniTokenizer: one model and one weight for image-video joint tokenization.

Language: Python | License: MIT | Stargazers: 192 | Issues: 0

Libra

Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)

Language: Python | License: Apache-2.0 | Stargazers: 36 | Issues: 0

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python | License: MIT | Stargazers: 1041 | Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 27944 | Issues: 0

Bend

A massively parallel, high-level programming language

Language: Rust | License: Apache-2.0 | Stargazers: 16829 | Issues: 0

SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Language: Python | License: NOASSERTION | Stargazers: 526 | Issues: 0