JuliaGPU / CUDA.jl

CUDA programming in Julia.

Home Page: https://juliagpu.org/cuda/

Using copyto! with SharedArray triggers a scalar indexing disallowed error

Jdogzz opened this issue

Describe the bug

Using copyto! with a SharedArray triggers a scalar indexing disallowed error, while using an otherwise identical regular array does not.

To reproduce

The Minimal Working Example (MWE) for this bug:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using CUDA, SharedArrays

julia> gpuarraystack=CuArray{UInt8}(undef,5,100,100);

julia> frame=SharedArray{UInt8}((100,100));

julia> copyto!(frame,gpuarraystack[1,:,:]);
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
 [1] error(s::String)
   @ Base .\error.jl:35
 [2] errorscalar(op::String)
   @ GPUArraysCore C:\Users\myusername\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:155
 [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
   @ GPUArraysCore C:\Users\myusername\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:128
 [4] assertscalar(op::String)
   @ GPUArraysCore C:\Users\myusername\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:116
 [5] getindex(A::CuArray{UInt8, 2, CUDA.Mem.DeviceBuffer}, I::Int64)
   @ GPUArrays C:\Users\myusername\.julia\packages\GPUArrays\OKkAu\src\host\indexing.jl:48
 [6] copyto_unaliased!(deststyle::IndexLinear, dest::SharedMatrix{…}, srcstyle::IndexLinear, src::CuArray{…})
   @ Base .\abstractarray.jl:1088
 [7] copyto!(dest::SharedMatrix{UInt8}, src::CuArray{UInt8, 2, CUDA.Mem.DeviceBuffer})
   @ Base .\abstractarray.jl:1068
 [8] top-level scope
   @ REPL[4]:1
 [9] top-level scope
   @ C:\Users\myusername\.julia\packages\CUDA\htRwP\src\initialization.jl:206
Some type information was truncated. Use `show(err)` to see complete types.

julia> framenormal=Array{UInt8,2}(undef,100,100);

julia> copyto!(framenormal,gpuarraystack[1,:,:]);

julia>
Manifest.toml

CUDA v5.2.0
GPUArrays v10.1.0
GPUCompiler v0.25.0
LLVM v6.6.3

Expected behavior

I expect the SharedArray to work identically to the normally declared array when copying from a GPU array, and not to trigger a scalar indexing error.

Version info

Details on Julia:

Julia Version 1.10.2
Commit bd47eca2c8 (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 12 × Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)

Details on CUDA:

CUDA runtime 12.3, artifact installation
CUDA driver 12.4
Unknown NVIDIA driver

CUDA libraries:
- CUBLAS: 12.3.4
- CURAND: 10.3.4
- CUFFT: 11.0.12
- CUSOLVER: 11.5.4
- CUSPARSE: 12.2.0
- CUPTI: 21.0.0
- NVML: missing

Julia packages:
- CUDA: 5.2.0
- CUDA_Driver_jll: 0.7.0+1
- CUDA_Runtime_jll: 0.11.1+0

Toolchain:
- Julia: 1.10.2
- LLVM: 15.0.7

1 device:
  0: NVIDIA GeForce GTX 1060 3GB (sm_61, 2.369 GiB / 3.000 GiB available)

I expect the SharedArray to work identically to the normally declared array

To correct the expectation here: SharedArray is a non-standard array type (there are many in the Julia ecosystem), and we don't necessarily support all possible array types in CUDA.jl/GPUArrays. Without a specialized method, copyto! falls back to Base's generic copyto_unaliased!, which (as the stack trace shows) reads the GPU source one element at a time and therefore hits the scalar-indexing check.

As noted by @vchuravy, we only strive to support array wrappers that are part of the Julia standard library, plus a handful of very popular array types beyond that. Support for other wrappers would have to be added in the upstream repository, e.g., as a package extension.
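
In the meantime, a workaround that avoids the generic fallback is to stage the copy through a plain Array. The sketch below reuses the names from the MWE above and is not an officially supported SharedArray path: copyto! from a CuArray into an Array takes CUDA.jl's bulk download path, and the Array-to-SharedArray copy then runs entirely on the CPU.

using CUDA, SharedArrays

gpuarraystack = CuArray{UInt8}(undef, 5, 100, 100)
frame = SharedArray{UInt8}((100, 100))

# Download the GPU slice to host memory in one bulk device-to-host transfer;
# Array(::CuArray) never falls back to scalar indexing.
hostslice = Array(gpuarraystack[1, :, :])

# Plain CPU-to-CPU copy into the SharedArray.
copyto!(frame, hostslice)

CUDA.@allowscalar copyto!(frame, gpuarraystack[1, :, :]) would also silence the error, but it transfers one element at a time and is only suitable for debugging.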

FYI, you can also try unified memory (by calling cu with unified=true, or by switching the global CUDA.jl memory preference); although still in development, it does away with the scalar indexing errors.
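
For completeness, a minimal sketch of that unified-memory route, assuming the unified=true keyword of cu mentioned above (CUDA.jl 5.x): allocating the stack in unified memory lets the host touch the data directly, so the generic copyto! fallback no longer raises the scalar-indexing error, although the staged copy shown earlier is still likely to be faster.

using CUDA, SharedArrays

# Allocate in unified (managed) memory rather than plain device memory.
gpuarraystack = cu(zeros(UInt8, 5, 100, 100); unified=true)

frame = SharedArray{UInt8}((100, 100))

# Host-side access to a unified buffer is permitted, so this line no longer
# throws "Scalar indexing is disallowed" (it may still iterate on the CPU).
copyto!(frame, gpuarraystack[1, :, :])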