roastduck / FreeTensor

A language and compiler for irregular tensor programs.

Home Page: https://roastduck.github.io/FreeTensor/

cuBLAS backend of MatMul does not work with stream parallelism

roastduck opened this issue

We should run cuBLAS in the appropriate stream, which in turn requires creating a separate cuBLAS handle for each stream. Since we cache the cuBLAS handle in GPUContext, that cache has to support multiple streams.
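
One possible shape for such a cache is sketched below. This is only an illustration, not FreeTensor's actual code: the class name `PerStreamHandleCache` and its `create`/`destroy` callbacks are hypothetical, and how it would fit into GPUContext is an assumption. The template is kept generic so it compiles without CUDA; in real use `Stream` would be `cudaStream_t`, `Handle` would be `cublasHandle_t`, and the callbacks would wrap `cublasCreate` plus `cublasSetStream`, and `cublasDestroy`.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical per-stream handle cache (not FreeTensor's real API).
// Intended instantiation: Stream = cudaStream_t, Handle = cublasHandle_t.
template <class Stream, class Handle>
class PerStreamHandleCache {
  public:
    using Create = std::function<Handle(Stream)>;
    using Destroy = std::function<void(Handle)>;

    // With cuBLAS, `create` would call cublasCreate(&h) followed by
    // cublasSetStream(h, stream), and `destroy` would call cublasDestroy(h).
    PerStreamHandleCache(Create create, Destroy destroy)
        : create_(std::move(create)), destroy_(std::move(destroy)) {}

    // The cache owns its handles, so copying is disabled.
    PerStreamHandleCache(const PerStreamHandleCache &) = delete;
    PerStreamHandleCache &operator=(const PerStreamHandleCache &) = delete;

    ~PerStreamHandleCache() {
        for (auto &kv : handles_) {
            destroy_(kv.second);
        }
    }

    // Return the handle bound to `stream`, creating it on first use.
    Handle get(Stream stream) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = handles_.find(stream);
        if (it == handles_.end()) {
            it = handles_.emplace(stream, create_(stream)).first;
        }
        return it->second;
    }

    // Number of handles currently cached.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return handles_.size();
    }

  private:
    Create create_;
    Destroy destroy_;
    mutable std::mutex mutex_;
    std::unordered_map<Stream, Handle> handles_;
};
```

With this layout, a MatMul call would look up `cache.get(currentStream)` before invoking cuBLAS, so each stream reuses its own handle and no handle is shared across streams. The mutex only guards the map; it assumes each stream's handle is used by one thread at a time.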