Gongyu Wang's repositories
apollo
An open autonomous driving platform
booksim2
BookSim 2.0
caffe
Caffe: a fast open framework for deep learning.
chipyard
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
circt
Circuit IR Compilers and Tools
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
dana
Test/benchmark regression and comparison system with dashboard
dotfiles
My dotfiles (e.g. .vimrc, .tmux.conf)
googletest
GoogleTest - Google Testing and Mocking Framework
HolisticTraceAnalysis
A library to analyze PyTorch traces.
invrs-io_gym
A collection of inverse design challenges
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
jaxtyping
Type annotations and runtime checking for PyTrees and for the shape and dtype of JAX arrays.
kakoune
mawww's experiment for a better code editor
llama.cpp
Port of Facebook's LLaMA model in C/C++
mojo
The Mojo Programming Language
nccl-tests
NCCL Tests
OpenVAF
An innovative Verilog-A compiler
pykan
Kolmogorov-Arnold Networks
scrcpy
Display and control your Android device
TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
timeloop
Timeloop performs modeling, mapping and code-generation for Tensor Algebra workloads running on Explicitly-Decoupled Data Orchestration (EDDO) architectures.
torchtyping
Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.
tree-sitter
An incremental parsing system for programming tools
xla
A machine learning compiler for GPUs, CPUs, and ML accelerators