Repositories under the tensor-cores topic:
⚡️ Write HGEMM from scratch using Tensor Cores with the WMMA, MMA, and CuTe APIs, achieving peak ⚡️ performance.
Vulkan & GLSL implementation of FlashAttention-2
A benchmarking framework for correlators of FX telescope arrays
📚 FFPA (Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑ 🎉 vs. SDPA EA.