DataFrame.fromArrow crashes with CPU buffer
bryevdv opened this issue
> {DataFrame} = require("@rapidsai/cudf")
> const data = fs.readFileSync('modules/demo/spatial/data/263_tracts.arrow')
> data
<Buffer ff ff ff ff 68 01 00 00 10 00 00 00 00 00 0a 00 0c 00 06 00 05 00 08 00 0a 00 00 00 00 01 04 00 0c 00 00 00 08 00 08 00 00 00 04 00 08 00 00 00 04 00 ... 718182 more bytes>
> DataFrame.fromArrow(data)
/opt/node-rapids/modules/.cache/cpm/arrow/395ed2335696bfe47705f21d4c7a106d9c3d1c76/cpp/src/arrow/result.cc:28: ValueOrDie called on an error: IOError: Cuda error 1 in function 'cuMemcpyDtoH': [CUDA_ERROR_INVALID_VALUE] invalid argument
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLog14PrintBackTraceEv+0x39)[0x7fd2a95b7599]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLogD1Ev+0x5f)[0x7fd2a95b750f]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLogD0Ev+0x1c)[0x7fd2a95b7534]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util8ArrowLogD1Ev+0x4b)[0x7fd2a95b7351]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow8internal14DieWithMessageERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x5c)[0x7fd2a93ac760]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow8internal17InvalidValueOrDieERKNS_6StatusE+0x88)[0x7fd2a93ac824]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(_ZNO5arrow6ResultISt10shared_ptrINS_3ipc23RecordBatchStreamReaderEEE10ValueOrDieEv+0x4b)[0x7fd2b4ed0a07]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(_ZN2nv5Table10from_arrowERKN4Napi12CallbackInfoE+0x272)[0x7fd2b4ecd932]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d2618)[0x7fd2b4ec4618]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d3945)[0x7fd2b4ec5945]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d26b0)[0x7fd2b4ec46b0]
nodejs[0xa390bf]
nodejs[0xce224b]
nodejs[0xce37fc]
nodejs(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0x16)[0xce3e76]
nodejs[0x15046d9]
Aborted (core dumped)
comment from @trxcllnt
I think the second thing happens because we don't check whether the input is a host buffer, and Arrow explicitly calls cuMemcpyDtoH() somewhere (that is the call that fails in the trace above).
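To make the missing check concrete, here is a minimal sketch of what a host-buffer guard could look like on the JavaScript side. It is only an illustration: `isHostMemory` and `chooseArrowReadPath` are hypothetical names and not part of `@rapidsai/cudf`, and the real fix may well belong in the native `Table::from_arrow` binding instead. The sketch assumes that anything backed by an `ArrayBuffer` (a Node `Buffer`, a typed array, a `DataView`) lives in CPU memory and that everything else should be treated as a device buffer.

```javascript
// Hypothetical helper, NOT part of @rapidsai/cudf.
// Returns true when `input` is backed by host (CPU) memory.
function isHostMemory(input) {
  return (
    ArrayBuffer.isView(input) ||          // Node Buffer, Uint8Array, DataView, ...
    input instanceof ArrayBuffer ||
    (typeof SharedArrayBuffer !== 'undefined' &&
      input instanceof SharedArrayBuffer)
  );
}

// Hypothetical dispatcher: decide which read path to take up front,
// instead of assuming the bytes always live on the GPU.
function chooseArrowReadPath(input) {
  return isHostMemory(input) ? 'host' : 'device';
}
```

With a guard like this, the `Buffer` returned by `fs.readFileSync` in the report would be routed to a host read path rather than into a device-to-host copy.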