JuliaGPU / CUDA.jl

CUDA programming in Julia.

Home Page: https://juliagpu.org/cuda/

CUBLAS: nrm2 support for StridedCuArray with length requiring Int64

tipfom opened this issue

Currently, the CUDA.CUBLAS.nrm2 wrapper function fails when the length of the array exceeds the largest value an Int32 can hold (2^31 - 1).
Because LinearAlgebra.norm relies on this wrapper, it fails too, with an error of the following form: "InexactError: trunc(Int32, 4294967296)".
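The conversion behind that message can be reproduced in plain Julia without a GPU. The following is a minimal sketch, assuming the wrapper converts the array length to Int32 before calling into the library; the helper name `fits_in_int32` is hypothetical and only serves the illustration.

```julia
# Plain Julia, no GPU required: reproduce the failing length conversion.
n = 2^32  # 4294967296, the length from the reported error message

# Hypothetical helper: does this length fit into a 32-bit signed integer?
fits_in_int32(len::Integer) = len <= typemax(Int32)

# Converting an out-of-range Int64 to Int32 throws instead of wrapping around.
err = try
    Int32(n)
    nothing
catch e
    e
end

println(fits_in_int32(n))       # false, since typemax(Int32) == 2147483647
println(err isa InexactError)   # true
```

Any length above `typemax(Int32)` triggers the same exception, so the failure depends only on the element count and not on the element type.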

In my view, the best solution is to extend the nrm2 wrapper definition to include the 64-bit variants of the nrm2 functions, namely cublasDnrm2_v2_64, cublasSnrm2_v2_64, cublasDznrm2_v2_64 and cublasScnrm2_v2_64.
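One way to picture such an extension is a selection step that picks the entry point from the element type and the array length. The sketch below only computes the function name as a `Symbol` and does not call into CUBLAS; `NRM2_PREFIX` and `nrm2_symbol` are hypothetical names, and whether the actual wrapper switches on length or always uses the 64-bit variants is an assumption not confirmed here.

```julia
# Map element types to the prefixes used in the nrm2 function names.
const NRM2_PREFIX = Dict(
    Float32    => "S",
    Float64    => "D",
    ComplexF32 => "Sc",
    ComplexF64 => "Dz",
)

# Hypothetical helper: choose the entry point name for eltype T and length n.
# Lengths that do not fit in Int32 select the `_64` variant.
function nrm2_symbol(::Type{T}, n::Integer) where {T}
    base = "cublas" * NRM2_PREFIX[T] * "nrm2_v2"
    return n > typemax(Int32) ? Symbol(base * "_64") : Symbol(base)
end

println(nrm2_symbol(Float64, 2^32))    # cublasDnrm2_v2_64
println(nrm2_symbol(ComplexF32, 1000)) # cublasScnrm2_v2
```

Keeping the 32-bit path for short arrays leaves existing behaviour untouched, while long arrays avoid the Int32 conversion entirely.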

PR #2269 provides a suggested fix :)