EleutherAI's repositories
lm-evaluation-harness
A framework for few-shot evaluation of language models.
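A typical invocation through the harness's command-line entry point (flag names follow recent releases; the model and task names here are just illustrative):

    lm_eval --model hf \
        --model_args pretrained=EleutherAI/pythia-160m \
        --tasks lambada_openai,hellaswag \
        --num_fewshot 5 \
        --batch_size 8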
concept-erasure
Erasing concepts from neural representations with provable guarantees
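A minimal sketch of the fitting API, following the usage shown in the repo's README (the shapes and data here are synthetic):

    import torch
    from concept_erasure import LeaceEraser

    n, d, k = 2048, 128, 2
    X = torch.randn(n, d)           # representations
    Z = torch.randint(0, k, (n,))   # concept labels to erase

    eraser = LeaceEraser.fit(X, Z)  # closed-form LEACE fit
    X_erased = eraser(X)            # X with the concept linearly removed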
DeeperSpeed
EleutherAI's fork of DeepSpeed, a deep learning optimization library that makes distributed training easy, efficient, and effective.
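Usage mirrors upstream DeepSpeed; a minimal sketch, assuming a JSON config file named ds_config.json (the model and config path are placeholders):

    import deepspeed
    import torch.nn as nn

    model = nn.Linear(512, 512)  # stand-in for a real network

    # Wrap the model in a DeepSpeed engine; the JSON config controls
    # batch size, optimizer, ZeRO stage, precision, and so on.
    model_engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config="ds_config.json",
    )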
improved-t5
Experiments toward training a new and improved T5
rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
features-across-time
Understanding how features learned by neural networks evolve throughout training
elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard ones
tokengrams
Efficiently computing & storing token n-grams from large corpora
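The underlying idea in toy form (an illustrative Python sketch, not the tokengrams API, which indexes corpora far too large for an in-memory dictionary):

    from collections import Counter

    def count_ngrams(tokens: list[int], n: int) -> Counter:
        # Slide a window of length n over the token stream, tallying each n-gram.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    counts = count_ngrams([1, 2, 3, 1, 2, 3, 4], n=2)
    assert counts[(1, 2)] == 2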
variance-across-time
Studying the variance in neural net predictions across training time
tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
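For intuition, a logit-lens-style sketch: decode each layer's hidden state through the model's final norm and unembedding. The tuned lens refines this by training a learned affine translator per layer; the attribute names below are specific to GPT-NeoX-style models in transformers, and the checkpoint is just an example:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")

    inputs = tok("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    for layer, h in enumerate(out.hidden_states):
        h = model.gpt_neox.final_layer_norm(h[:, -1])  # last-token state
        logits = model.embed_out(h)                    # decode via unembedding
        print(layer, tok.decode(logits.argmax(-1)))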
bayesian-adam
Exactly what it says on the tin
conceptual-constraints
Applying LEACE to models during training
RWKV-LM
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" context length, and free sentence embeddings.
cupbearer
A library for mechanistic anomaly detection
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, delivering better performance with lower memory utilization in both training and inference.
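A minimal FP8 sketch, closely following the example in the TransformerEngine README (requires a GPU with FP8 support, e.g. Hopper):

    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    # A drop-in replacement for torch.nn.Linear with FP8 support.
    model = te.Linear(768, 3072, bias=True)
    inp = torch.randn(2048, 768, device="cuda")

    # All recipe arguments are optional; these mirror the README example.
    fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

    # Run the forward pass with FP8 autocasting enabled.
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        out = model(inp)

    out.sum().backward()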