NVIDIA / MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Home Page: https://nvidia.github.io/MatX

[FEA] Allow stride 0 batched GEMMs

cliffburdick opened this issue

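cuBLAS allows a batch stride of 0 on A, B, or both in a strided batched GEMM, so a matrix that is shared across the whole batch doesn't need to be repeated in memory. MatX should use this feature when an operand is the same for every batch entry.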
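
To make the request concrete, here is a minimal CPU sketch of what a stride of 0 means for a strided batched GEMM. It is not MatX or cuBLAS code: `batched_gemm_ref` and `shared_a_example` are hypothetical names used only for illustration, and the layout is row-major for readability, whereas cuBLAS itself is column-major. The stride arguments are assumed to play the same role as the `strideA`/`strideB`/`strideC` parameters of `cublasSgemmStridedBatched`, i.e. the element offset between consecutive batch entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for a row-major strided batched GEMM: C[i] = A[i] * B[i].
// Each stride is the element offset between consecutive batch entries.
// A stride of 0 makes every batch entry read the same matrix.
void batched_gemm_ref(const float* a, const float* b, float* c,
                      int m, int n, int k, int batch,
                      std::size_t stride_a, std::size_t stride_b,
                      std::size_t stride_c) {
    assert(m >= 0 && n >= 0 && k >= 0 && batch >= 0);
    for (int i = 0; i < batch; ++i) {
        const float* ai = a + i * stride_a;  // same pointer every time if stride_a == 0
        const float* bi = b + i * stride_b;
        float* ci = c + i * stride_c;
        for (int r = 0; r < m; ++r) {
            for (int col = 0; col < n; ++col) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p) {
                    acc += ai[r * k + p] * bi[p * n + col];
                }
                ci[r * n + col] = acc;
            }
        }
    }
}

// Multiply one shared 2x2 A with two distinct 2x2 B matrices.
// A is stored once (4 floats) instead of being copied per batch entry.
std::vector<float> shared_a_example() {
    const std::vector<float> a = {1, 2, 3, 4};              // single A
    const std::vector<float> b = {1, 0, 0, 1,  2, 0, 0, 2}; // B[0] = I, B[1] = 2I
    std::vector<float> c(8, 0.0f);
    batched_gemm_ref(a.data(), b.data(), c.data(),
                     /*m=*/2, /*n=*/2, /*k=*/2, /*batch=*/2,
                     /*stride_a=*/0, /*stride_b=*/4, /*stride_c=*/4);
    return c;
}
```

In this sketch only A uses a stride of 0; the output keeps a nonzero stride so that each batch entry writes to its own block of C.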