Lukas Blecher's starred repositories
stable-diffusion
A latent text-to-image diffusion model
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's text-to-image transformer, in PyTorch
taming-transformers
Taming Transformers for High-Resolution Image Synthesis
AITemplate
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
softmax-splatting
An implementation of softmax splatting for differentiable forward warping using PyTorch
math-formula-recognition
Math formula recognition (Images to LaTeX strings)
TransDepth
Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction
formula_gan
Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention