Ferdinand Mom's repositories
3outeille.github.io
My website
AutoGPTQ
An easy-to-use model quantization package with user-friendly APIs, based on the GPTQ algorithm.
candle
Minimalist ML framework for Rust
ChatRWKV
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
ggml
Tensor library for machine learning
gptcore
Fast modular code to create and train cutting edge LLMs
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
lightning-GPT
Train and run GPTs with Lightning
mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
megatron-smol-cluster
Megatron-LM setup on the smol-cluster
MS-AMP
Microsoft Automatic Mixed Precision Library
nanotron
Minimalistic large language model 3D-parallelism training
OBC
Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".
pipegoose
Large-scale 4D-parallel pre-training for 🤗 transformers with Mixture of Experts *(still a work in progress)*
RWKV-CUDA
The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
too-many-lists
Learn Rust by writing Entirely Too Many linked lists
whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.