TiledTensor / TiledCUDA

TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.

Pass unit tests for tensor core GEMM

haruhi55 opened this issue

The tensor core GEMM still fails its unit tests in several cases, so the implementation contains bugs. The cases below do not pass yet:

// run_test<32, 32, 32, tl::RowMajor<2, 1>, 32>();
// run_test<64, 64, 32, tl::RowMajor<2, 2>, 32>();
// run_test<64, 32, 128, tl::RowMajor<2, 2>, 32>();
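One way to narrow down failures like these is to compare the kernel output against a naive host-side GEMM and report the first element that disagrees. The sketch below is only an illustration: `reference_gemm` and `first_mismatch` are hypothetical helpers, not part of TiledCUDA, and it assumes row-major operands, that the first three template arguments of `run_test` are the M, N and K extents, and that the tolerances shown are acceptable for half-precision inputs.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a TiledCUDA API): naive row-major reference GEMM.
// Computes C[M x N] = A[M x K] * B[K x N], accumulating in float.
template <int M, int N, int K>
std::vector<float> reference_gemm(const std::vector<float>& a,
                                  const std::vector<float>& b) {
    std::vector<float> c(static_cast<std::size_t>(M) * N, 0.0f);
    for (int i = 0; i < M; ++i) {
        for (int k = 0; k < K; ++k) {
            const float a_ik = a[i * K + k];
            for (int j = 0; j < N; ++j) {
                c[i * N + j] += a_ik * b[k * N + j];
            }
        }
    }
    return c;
}

// Hypothetical helper: returns the flat index of the first element whose
// error exceeds abs_tol + rel_tol * |want|, or -1 if every element matches.
// A size mismatch is reported as index 0. NaNs in `got` count as mismatches.
inline long first_mismatch(const std::vector<float>& got,
                           const std::vector<float>& want,
                           float rel_tol = 1e-2f, float abs_tol = 1e-3f) {
    if (got.size() != want.size()) return 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        const float err = std::fabs(got[i] - want[i]);
        if (!(err <= abs_tol + rel_tol * std::fabs(want[i]))) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Dividing the returned index by N gives the row and taking it modulo N gives the column, which may help show whether the wrong results line up with a particular warp tile in the failing configurations.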