MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Go ahead and axolotl questions
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
ImageBind One Embedding Space to Bind Them All
Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code
experiments with inference on llama
Inference Llama 2 in one file of pure C
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
An unnecessarily tiny implementation of GPT-2 in NumPy.
Prototype routines for GPU quantization written using PyTorch.
Python Performance Benchmark Suite
Serve, optimize and scale PyTorch models in production