Elliot Gorokhovsky's starred repositories
vectorscan
A portable fork of the high-performance regular expression matching library
flash-attention
Fast and memory-efficient exact attention
CUDA-Programming
Sample codes for my CUDA programming book
ThunderKittens
Tile primitives for speedy kernels
mlir-tutorial
MLIR For Beginners tutorial
CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
Triton-Puzzles
Puzzles for learning Triton
optimized-routines
Optimized implementations of various library functions for ARM architecture processors
firecracker
Secure and fast microVMs for serverless computing.
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
oss-fuzz-gen
LLM powered fuzzing via OSS-Fuzz.
libphonenumber
Google's common Java, C++ and JavaScript library for parsing, formatting, and validating international phone numbers.
StackRabbit
An AI for playing NES Tetris at a high level. Based primarily on search & heuristic, with high quality board evaluation through value iteration.
llvm-tutor
A collection of out-of-tree LLVM passes for teaching and learning
quickstack
A tool to take call stack traces with minimal overheads