Fengxi ZHANG's starred repositories
1d-tokenizer
This repo contains the code for our paper "An Image is Worth 32 Tokens for Reconstruction and Generation"
vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
patchify.py
A library that helps you split an image into small, overlappable patches and merge the patches back into the original image.
L3C-PyTorch
PyTorch Implementation of the CVPR'19 Paper "Practical Full Resolution Learned Lossless Image Compression"
imageio-flif
imageio plugin with FLIF wrapper for Python
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
CompressAI-Vision
CompressAI-Vision helps you design, test and compare Video Compression for Machines pipelines. Compression methods can be either pulled from custom AI-based modules from CompressAI or traditional codecs such as H.266/VVC.
CompressAI
A PyTorch library and evaluation platform for end-to-end compression research
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
SpeechTokenizer
This is the code for SpeechTokenizer, presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
magvit2-pytorch
Implementation of MagViT2 Tokenizer in Pytorch