skyguy126's starred repositories
Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.
SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
simpleGEMM
A simple yet fast implementation of matrix multiplication in CUDA.
cuda-tutorial
A set of hands-on tutorials for CUDA programming
cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
ShiftAddLLM
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
dji-firmware-tools
Tools for handling firmware of DJI products, with a focus on quadcopters.
pytorch-cnn-visualizations
PyTorch implementation of convolutional neural network visualization techniques
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
onetimesecret
Keep passwords and other sensitive information out of your inboxes and chat logs.