Soundness issues (3)
Narsil opened this issue
And another one. I am opening a new issue since I'm not quite sure they are related (one is during the drop, the other during normal usage).
let dev0 = CudaDevice::new(0).unwrap();
let dev1 = CudaDevice::new(1).unwrap();
let slice = dev0.htod_copy(vec![1.0; 10]).unwrap();
let out = dev0.dtoh_sync_copy(&slice).unwrap();
This panics with the following error:
thread 'tests::dummy' panicked at 'called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_INVALID_VALUE, "invalid argument")', src/lib.rs:103:47
stack backtrace:
0: rust_begin_unwind
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
1: core::panicking::panic_fmt
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
2: core::result::unwrap_failed
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/result.rs:1687:5
3: core::result::Result<T,E>::unwrap
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/result.rs:1089:23
4: cudarc::tests::dummy
at ./src/lib.rs:103:19
5: cudarc::tests::dummy::{{closure}}
at ./src/lib.rs:99:16
6: core::ops::function::FnOnce::call_once
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ops/function.rs:250:5
7: core::ops::function::FnOnce::call_once
at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
My understanding is that once CudaDevice::new(1) has been called, the current CUDA context targets device 1, so the device_ptr for slice is no longer valid in that context.
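To make that reasoning concrete, here is a minimal toy model of the suspected failure mode. It does not call CUDA or cudarc at all: `MockDevice`, `CURRENT_CTX`, `copy_unchecked` and `reproduce` are hypothetical names invented for illustration, and the assumption that constructing a device makes its context current is taken from the understanding described above rather than verified against the library.

```rust
use std::cell::Cell;

thread_local! {
    // Toy stand-in for the driver's per-thread "current context".
    // Assumption: it is identified here simply by a device ordinal.
    static CURRENT_CTX: Cell<Option<usize>> = Cell::new(None);
}

/// Hypothetical device handle; this is not cudarc's `CudaDevice`.
struct MockDevice {
    ordinal: usize,
}

impl MockDevice {
    /// Assumption from the discussion: creating a device also makes
    /// its context the current one for this thread.
    fn new(ordinal: usize) -> Self {
        CURRENT_CTX.with(|c| c.set(Some(ordinal)));
        MockDevice { ordinal }
    }

    /// An operation that trusts whatever context happens to be current.
    fn copy_unchecked(&self) -> Result<(), String> {
        let current = CURRENT_CTX.with(|c| c.get());
        if current == Some(self.ordinal) {
            Ok(())
        } else {
            Err(format!(
                "invalid argument: device {} used while context {:?} is current",
                self.ordinal, current
            ))
        }
    }
}

/// Mirrors the shape of the reproducer: use dev0 before and after
/// a second device has been created.
fn reproduce() -> (Result<(), String>, Result<(), String>) {
    let dev0 = MockDevice::new(0);
    let before = dev0.copy_unchecked();
    let _dev1 = MockDevice::new(1); // silently switches the current context
    let after = dev0.copy_unchecked();
    (before, after)
}
```

In this model the first operation on `dev0` succeeds and the second fails, only because another device was constructed in between.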
The only "simple" fix I see is protecting every safe operation by result::ctx::set_current(cu_primary_ctx)?;. Is that correct?
If that's the case, wouldn't a sort of Mutex::lock() be more effective at preventing this kind of issue?
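One possible reading of the lock-style idea, sketched under heavy assumptions: operations are only reachable through a handle returned by a `bind` call, and that handle mutably borrows a token representing the thread's current context, so two devices cannot be bound at the same time. `ThreadCtx`, `MockDevice`, `Bound`, `bind` and `demo` are all hypothetical names; nothing here is cudarc's API, and a real driver context is global per-thread state that such a token cannot fully protect from other code.

```rust
/// Hypothetical token standing in for the thread's current context.
struct ThreadCtx {
    current: Option<usize>,
}

/// Hypothetical device handle; this is not cudarc's `CudaDevice`.
struct MockDevice {
    ordinal: usize,
}

/// Handle returned by `bind`. While it is alive, the token is mutably
/// borrowed, so the borrow checker rejects binding another device.
struct Bound<'a> {
    ctx: &'a mut ThreadCtx,
    ordinal: usize,
}

impl MockDevice {
    /// Makes this device's context current and returns the only handle
    /// through which operations can be issued.
    fn bind<'a>(&self, ctx: &'a mut ThreadCtx) -> Bound<'a> {
        ctx.current = Some(self.ordinal);
        Bound { ctx, ordinal: self.ordinal }
    }
}

impl Bound<'_> {
    /// Toy operation: succeeds only if the bound device is current,
    /// returning the ordinal it ran on.
    fn copy(&self) -> Result<usize, String> {
        if self.ctx.current == Some(self.ordinal) {
            Ok(self.ordinal)
        } else {
            Err(format!("context mismatch for device {}", self.ordinal))
        }
    }
}

/// Creates two devices, then uses them in the opposite order.
fn demo() -> (Result<usize, String>, Result<usize, String>) {
    let mut ctx = ThreadCtx { current: None };
    let dev0 = MockDevice { ordinal: 0 };
    let dev1 = MockDevice { ordinal: 1 };
    // Each handle is dropped at the end of its statement,
    // which releases the borrow on `ctx`.
    let on1 = dev1.bind(&mut ctx).copy();
    let on0 = dev0.bind(&mut ctx).copy();
    (on0, on1)
}
```

The trade-off in this sketch is ergonomic: every call site has to thread the token through, whereas rebinding inside each operation keeps the public API unchanged.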
Happy to try and provide PRs for fixes.
The only "simple" fix I see is protecting every safe operation by result::ctx::set_current(cu_primary_ctx)?;. Is that correct?
Yes, making this fix resolves both this panic and #160.
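As a rough illustration of what that fix amounts to, here is a toy model in which every operation rebinds its own context before doing any work. It does not use CUDA or cudarc: `MockDevice`, `CURRENT_CTX`, `bind_to_thread`, `copy` and `interleaved` are hypothetical names, with `bind_to_thread` playing the role of the `set_current` call discussed in this thread.

```rust
use std::cell::Cell;

thread_local! {
    // Toy stand-in for the driver's per-thread current context.
    static CURRENT_CTX: Cell<Option<usize>> = Cell::new(None);
}

/// Hypothetical device handle; this is not cudarc's `CudaDevice`.
struct MockDevice {
    ordinal: usize,
}

impl MockDevice {
    /// Assumption from the discussion: construction binds the context.
    fn new(ordinal: usize) -> Self {
        let dev = MockDevice { ordinal };
        dev.bind_to_thread();
        dev
    }

    /// Plays the role of the `set_current` call in this model.
    fn bind_to_thread(&self) {
        CURRENT_CTX.with(|c| c.set(Some(self.ordinal)));
    }

    /// Every safe operation rebinds first, so it no longer depends on
    /// which device happened to be created or used last.
    fn copy(&self) -> Result<usize, String> {
        self.bind_to_thread();
        match CURRENT_CTX.with(|c| c.get()) {
            Some(cur) if cur == self.ordinal => Ok(cur),
            other => Err(format!(
                "device {} used while context {:?} is current",
                self.ordinal, other
            )),
        }
    }
}

/// Creates both devices up front, then alternates between them.
fn interleaved() -> Vec<Result<usize, String>> {
    let dev0 = MockDevice::new(0);
    let dev1 = MockDevice::new(1);
    vec![dev0.copy(), dev1.copy(), dev0.copy()]
}
```

In this model, alternating between the two devices succeeds every time, at the cost of one extra bind per operation.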