m4rs-mt / ILGPU

ILGPU JIT Compiler for high-performance .NET GPU programs

Home Page: http://www.ilgpu.net


Tensor Cores

lostmsu opened this issue

ILGPU needs a way to utilize Tensor Cores. The API should draw from the family of VectorXXX intrinsics in .NET and/or the Vulkan Cooperative Matrix extension proposed by NVIDIA.

Related CUDA documentation: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions

This is also mentioned in #923, but the latter is more about support for shorter floats in general.
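For context, here is a minimal sketch of the warp-level matrix operations from the linked documentation, written against CUDA C++'s `nvcuda::wmma` API. This is plain CUDA, not ILGPU code, and it does not suggest what an ILGPU API should look like. The names `fill_ones`, `wmma_tile`, and `run_tile_demo` are hypothetical helpers made up for illustration. It assumes a GPU with compute capability 7.0 or newer and compilation with something like `nvcc -arch=sm_70`.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <mma.h>
#include <vector>

using namespace nvcuda;

// One 16x16x16 tile, a shape the WMMA API supports for half inputs.
constexpr int TILE = 16;

// Fill both input tiles with 1.0 on the device (avoids host-side half conversion).
__global__ void fill_ones(half* a, half* b) {
    int i = threadIdx.x;
    if (i < TILE * TILE) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
    }
}

// Computes d = a * b for a single tile. All 32 threads of the warp must
// execute these calls together; the fragments are spread across the warp.
__global__ void wmma_tile(const half* a, const half* b, float* d) {
    wmma::fragment<wmma::matrix_a, TILE, TILE, TILE, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, TILE, TILE, TILE, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, TILE, TILE, TILE, float> acc;

    wmma::fill_fragment(acc, 0.0f);             // start from a zero accumulator
    wmma::load_matrix_sync(a_frag, a, TILE);    // leading dimension = 16
    wmma::load_matrix_sync(b_frag, b, TILE);
    wmma::mma_sync(acc, a_frag, b_frag, acc);   // acc = a_frag * b_frag + acc
    wmma::store_matrix_sync(d, acc, TILE, wmma::mem_row_major);
}

// Hypothetical host helper: returns the 16x16 result in row-major order,
// or an empty vector if any CUDA call failed (for example, no suitable GPU).
std::vector<float> run_tile_demo() {
    const size_t in_bytes = TILE * TILE * sizeof(half);
    const size_t out_bytes = TILE * TILE * sizeof(float);
    half* a = nullptr;
    half* b = nullptr;
    float* d = nullptr;
    std::vector<float> out(TILE * TILE, 0.0f);

    cudaMalloc(&a, in_bytes);
    cudaMalloc(&b, in_bytes);
    cudaMalloc(&d, out_bytes);

    fill_ones<<<1, TILE * TILE>>>(a, b);
    wmma_tile<<<1, 32>>>(a, b, d);  // exactly one warp

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(out.data(), d, out_bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(a);
    cudaFree(b);
    cudaFree(d);

    if (err != cudaSuccess) {
        out.clear();
    }
    return out;
}
```

Calling `run_tile_demo()` from a `main` function runs the tile multiply on one warp. With all-ones inputs, every entry of the mathematical product is 16, which makes the result easy to check by eye. The request above is essentially for a way to express the fragment load, multiply-accumulate, and store steps from ILGPU kernels.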

Thanks a lot for your feature request. Given the performance improvements that can be achieved using Tensor Cores on NVIDIA hardware, it definitely makes sense to add support for Tensor Cores in v2.0 (which is going to be the next big release after v1.5).