Mark Saroufim's repositories
awesome-profiling
Awesome utilities for performance profiling
mlsys-experiments
stuff
Triton-Puzzles
Puzzles for learning Triton
algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Liger-Kernel
Efficient Triton Kernels for LLM Training
llama-inference
Experiments with inference on Llama
llm.c
LLM training in simple, raw C/CUDA
lm-evaluation-harness
A framework for few-shot evaluation of language models.
nvcc4jupyter
A plugin for Jupyter Notebook to run CUDA C/C++ code
pyperformance
Python Performance Benchmark Suite
pytorch.github.io
The website for PyTorch
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.