Gongyu Wang's repositories
Snoopie
Multi-GPU communication profiler and visualizer
nccl-tests
NCCL Tests
pykan
Kolmogorov-Arnold Networks
HolisticTraceAnalysis
A library to analyze PyTorch traces.
invrs-io_gym
A collection of inverse design challenges
OpenVAF
An innovative Verilog-A compiler
mojo
The Mojo Programming Language
chipyard
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
timeloop
Timeloop performs modeling, mapping and code-generation for Tensor Algebra workloads running on Explicitly-Decoupled Data Orchestration (EDDO) architectures.
scrcpy
Display and control your Android device
jaxtyping
Type annotations and runtime checking for the shape and dtype of JAX arrays, and for PyTrees.
torchtyping
Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.
xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
llama.cpp
Port of Facebook's LLaMA model in C/C++
circt
Circuit IR Compilers and Tools
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
tree-sitter
An incremental parsing system for programming tools
apollo
An open autonomous driving platform
TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
dotfiles
My dotfiles (e.g. .vimrc, .tmux.conf)
kakoune
mawww's experiment for a better code editor
dana
Test/benchmark regression and comparison system with dashboard
booksim2
BookSim 2.0
googletest
GoogleTest - Google Testing and Mocking Framework
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more