[FEA] Allow stride 0 batched GEMMs

Question

cliffburdick opened this issue 10 months ago

cuBLAS allows a batch stride of 0 on A or B, so one or both input matrices can be shared across every batch instead of being repeated in memory. We should use this feature where it applies.
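To illustrate what a stride of 0 means, here is a minimal CPU reference sketch of strided-batched GEMM semantics. It is not the cuBLAS implementation and not this library's code. It only mirrors the pointer arithmetic implied by the `strideA`, `strideB`, and `strideC` parameters of `cublasSgemmStridedBatched`, assuming column-major storage, no transposes, alpha = 1, and beta = 0. The names `gemm_strided_batched_ref` and `shared_a_example` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU reference: C[b] = A[b] * B[b] for b in [0, batch).
// Column-major, no transpose, alpha = 1, beta = 0 (assumptions for brevity).
void gemm_strided_batched_ref(int m, int n, int k,
                              const float* A, int lda, long long strideA,
                              const float* B, int ldb, long long strideB,
                              float* C, int ldc, long long strideC,
                              int batch) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int b = 0; b < batch; ++b) {
        // With a stride of 0, every batch resolves to the same matrix.
        const float* Ab = A + b * strideA;
        const float* Bb = B + b * strideB;
        float* Cb = C + b * strideC;
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p) {
                    acc += Ab[i + p * lda] * Bb[p + j * ldb];
                }
                Cb[i + j * ldc] = acc;
            }
        }
    }
}

// One 2x2 matrix A is shared by two batches; B and C hold two 2x1 matrices.
std::vector<float> shared_a_example() {
    const std::vector<float> A = {1, 3, 2, 4};  // [[1, 2], [3, 4]], column-major
    const std::vector<float> B = {1, 0,         // batch 0: [1, 0]^T
                                  0, 1};        // batch 1: [0, 1]^T
    std::vector<float> C(4, 0.0f);
    gemm_strided_batched_ref(2, 1, 2,
                             A.data(), 2, /*strideA=*/0,
                             B.data(), 2, /*strideB=*/2,
                             C.data(), 2, /*strideC=*/2,
                             /*batch=*/2);
    return C;  // {1, 3, 2, 4}: the two columns of A
}
```

Here A is stored once and read by both batches, which is the memory saving this request is after. The output stride is kept nonzero in this sketch so that each batch writes to its own result.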