HamiltonHuaji's starred repositories
NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
InterProcessPyObjects
High-performance and seamless sharing and modification of Python objects between processes, without the periodic overhead of serialization and deserialization. Provides fast inter-process communication (IPC) via shared memory. Supports NumPy, Torch arrays, custom classes (including dataclass), classes with methods, and asyncio
fight-for-the-open-web
F**k off Google.
ThunderKittens
Tile primitives for speedy kernels
machine-learning-articles
🧠💬 Articles I wrote about machine learning, archived from MachineCurve.com.
Neural_3D_Video
The repository for CVPR 2022 Paper "Neural 3D Video Synthesis"
defender-control
An open-source Windows Defender manager. Now you can disable Windows Defender permanently.
glsl_analyzer
Language server for GLSL (autocomplete, goto-definition, formatter, and more)
Awesome-Feature-Learning-in-Deep-Learning-Thoery
Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for scholars, enthusiasts, and anyone interested in delving into the fascinating world of feature learning within deep learning theory.
linux-exploit-suggester
Linux privilege escalation auditing tool
AlphaTrade
JAX-LOB: A GPU-accelerated limit order book simulator to unlock large-scale reinforcement learning for trading
fuchsia_radix_sort
The Vulkan GPU radix sort implementation from Google Fuchsia, but with CMake
ComputerGraphicsKnowledge
Computer Graphics and Game Development Knowledge
LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
GPU-Reshape
GPU Reshape (GRS) is an API-agnostic instrumentation framework with instruction-level validation.
flash_attn_jax
JAX bindings for Flash Attention v2
high-performance-go
High-performance coding with Go (high-performance Go programming, Go pitfalls, gotchas, traps)