cuda_malloc_async deleter w/ multiple devices.
burlen opened this issue
Streams are associated with a particular context, and the current context is per-thread state. When we change threads we have to change streams. The stream is captured by the deleter at the time of creation and can't be updated, so deleting the memory from another thread is problematic.