How can you intercept cudaSetupArgument or cudaSetupArg in CUDA Runtime?
dujiangsu opened this issue · comments
Hi,
I have been working on GPU pooling recently. While trying to intercept kernel functions, I got stuck on how to intercept cudaSetupArgument [in older versions] or cudaSetupArg [which is a macro in the CUDA runtime].
I couldn't find a related solution in your repo. Could you help me?
Jiangsu
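Hello,
Drawing on various projects, I have so far managed to intercept cudaRegisterFatbinary and cudaRegisterFunction, the functions used in older versions, and have successfully run the guest machine's program on the host. After CUDA 10.2, however, new functions such as cudaRegisterFatbinaryEnd, cudaPushConfigure, and cudaPopConfigure were added. Stranger still, cudaSetupArgument has been turned into a macro, so I don't know how to intercept it. Could you explain the logic behind this, or tell me how you handle passing arguments to kernel functions?
Best wishes for the autumn.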