There is 1 repository under the cuda-kernel topic.
Fast, differentiable sorting and ranking in PyTorch
Row-major matmul optimization
A performance comparison of standard matrix functions on CPU and GPU, written in C++ with NVIDIA CUDA in Visual Studio
Winning submission for StartHack 2024: HPC-optimized multi-GPU/CPU inference
SNU CSE Scalable High Performance Computing (M1522.006700) - 2023 Autumn
A custom CUDA kernel for windowed matrix multiplication
A beginner's guide to CUDA programming
Snippet repository for learning parallel GPU programming with CUDA.