fynv / ThrustRTC

A CUDA toolset for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core.

API addition to expose sync/wait functionality

slayoo opened this issue

Currently, when profiling code that uses ThrustRTC, execution times are attributed to subsequent API calls in a potentially misleading way, apparently due to asynchronous execution. Would it be possible to expose some wait/sync feature in the ThrustRTC API so that one could force launched kernels to finish executing?

or perhaps two versions:

  • launch()
  • launch_and_wait()

or

  • launch(wait=False)
  • launch(wait=True)
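To illustrate the second variant, here is a minimal sketch of how a `wait` flag changes what a timer sees. It does not use ThrustRTC at all: a single worker thread stands in for an asynchronous GPU stream, and `launch`, `_kernel` and `timed` are hypothetical names made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an asynchronous GPU stream: one worker thread
# that executes submitted "kernels" in order.
_stream = ThreadPoolExecutor(max_workers=1)

def _kernel(duration):
    """Pretend kernel: just sleeps for `duration` seconds."""
    time.sleep(duration)

def launch(duration, wait=False):
    """Submit the pretend kernel; block until it finishes only if wait=True."""
    future = _stream.submit(_kernel, duration)
    if wait:
        future.result()  # blocks the caller until the kernel is done
    return future

def timed(fn, *args, **kwargs):
    """Return (result, wall-clock seconds spent inside fn)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Without waiting, the call returns almost immediately, so the timer
# records next to nothing for a kernel that takes about 0.2 s.
pending, async_time = timed(launch, 0.2, wait=False)
pending.result()  # drain the stream before the next measurement

# With waiting, the measured time covers the kernel's execution.
_, sync_time = timed(launch, 0.2, wait=True)

print(f"wait=False: {async_time:.3f} s, wait=True: {sync_time:.3f} s")
```

In the `wait=False` case the 0.2 s of work would be charged to whichever later call happens to block, which is the misattribution described above.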

ping? :)

Hey, sorry. I haven't been on this lately. Will take a look.
I didn't consider this an issue, because all device->host transfers now use the synchronized copy APIs. However, yes, profiling is a good point.

thanks!

Added Wait() just now. (eb94b45)
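For profiling, a global wait like this would typically be called right before starting and right before stopping a timer. A rough sketch of that pattern follows, assuming only that some blocking `sync` callable is available. `FakeDevice` and `profiled` are hypothetical names for illustration; in real code the `sync` argument would be whatever the binding exposes for `Wait()`, which this thread does not spell out.

```python
import time
from contextlib import contextmanager
from concurrent.futures import ThreadPoolExecutor

class FakeDevice:
    """Hypothetical asynchronous device, used only to make the sketch runnable."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def launch(self, seconds):
        # Returns immediately; the "kernel" runs in the background.
        self._pending.append(self._pool.submit(time.sleep, seconds))

    def wait(self):
        # Blocks until everything launched so far has completed.
        for future in self._pending:
            future.result()
        self._pending.clear()

@contextmanager
def profiled(timings, label, sync):
    """Time the enclosed block, synchronizing before and after it."""
    sync()  # make sure earlier work is not charged to this block
    start = time.perf_counter()
    try:
        yield
    finally:
        sync()  # make sure this block's own work has finished
        timings[label] = time.perf_counter() - start

device = FakeDevice()
timings = {}

with profiled(timings, "slow_kernel", device.wait):
    device.launch(0.2)

with profiled(timings, "fast_kernel", device.wait):
    device.launch(0.01)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

Without the two `sync()` calls, the time for `slow_kernel` would likely show up under `fast_kernel` instead.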

kudos! Will try soon

It does work! Thank you!