rapidsai / node

GPU-accelerated data science and visualization in node

Home Page: https://rapidsai.github.io/node/

DataFrame.fromArrow crashes with CPU buffer

bryevdv opened this issue

> {DataFrame} = require("@rapidsai/cudf")

> const data = fs.readFileSync('modules/demo/spatial/data/263_tracts.arrow')

> data
<Buffer ff ff ff ff 68 01 00 00 10 00 00 00 00 00 0a 00 0c 00 06 00 05 00 08 00 0a 00 00 00 00 01 04 00 0c 00 00 00 08 00 08 00 00 00 04 00 08 00 00 00 04 00 ... 718182 more bytes>

> DataFrame.fromArrow(data)
/opt/node-rapids/modules/.cache/cpm/arrow/395ed2335696bfe47705f21d4c7a106d9c3d1c76/cpp/src/arrow/result.cc:28: ValueOrDie called on an error: IOError: Cuda error 1 in function 'cuMemcpyDtoH': [CUDA_ERROR_INVALID_VALUE] invalid argument
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLog14PrintBackTraceEv+0x39)[0x7fd2a95b7599]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLogD1Ev+0x5f)[0x7fd2a95b750f]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util7CerrLogD0Ev+0x1c)[0x7fd2a95b7534]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow4util8ArrowLogD1Ev+0x4b)[0x7fd2a95b7351]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow8internal14DieWithMessageERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x5c)[0x7fd2a93ac760]
/opt/node-rapids/modules/.cache/build/Debug/arrow-build/debug/libarrow.so.400(_ZN5arrow8internal17InvalidValueOrDieERKNS_6StatusE+0x88)[0x7fd2a93ac824]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(_ZNO5arrow6ResultISt10shared_ptrINS_3ipc23RecordBatchStreamReaderEEE10ValueOrDieEv+0x4b)[0x7fd2b4ed0a07]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(_ZN2nv5Table10from_arrowERKN4Napi12CallbackInfoE+0x272)[0x7fd2b4ecd932]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d2618)[0x7fd2b4ec4618]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d3945)[0x7fd2b4ec5945]
/opt/node-rapids/modules/cudf/build/Debug/node_cudf.node(+0x2d26b0)[0x7fd2b4ec46b0]
nodejs[0xa390bf]
nodejs[0xce224b]
nodejs[0xce37fc]
nodejs(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0x16)[0xce3e76]
nodejs[0x15046d9]
Aborted (core dumped)

comment from @trxcllnt

I think the second thing happens because we don't check whether the input is a host buffer, and Arrow explicitly calls cuMemcpyDtoH() somewhere (the call named in the error above).

https://github.com/rapidsai/node-rapids/blob/88caf60f20292a4ee306a648ea432280ab29bd94/modules/cudf/src/table/arrow.cpp#L69-L71
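For illustration, the missing check could look roughly like the JavaScript-side guard below. This is only a sketch: `classifyArrowInput` is a hypothetical helper, not part of `@rapidsai/cudf`, and it assumes that host memory always arrives as a Node.js `Buffer`, a typed array or `DataView`, or an `ArrayBuffer`/`SharedArrayBuffer`. Anything else (for example a device-memory wrapper object) is reported as `'other'`, and the caller decides what to do with it.

```javascript
// Hypothetical helper (not a @rapidsai/cudf API): classify an Arrow IPC input
// as host (CPU) memory or something else, before it is handed to native code
// that would copy from device memory.
function classifyArrowInput(input) {
  const isHost =
    Buffer.isBuffer(input) ||           // Node.js Buffer, e.g. from fs.readFileSync
    ArrayBuffer.isView(input) ||        // any typed array or DataView
    input instanceof ArrayBuffer ||     // raw ArrayBuffer
    (typeof SharedArrayBuffer !== 'undefined' &&
      input instanceof SharedArrayBuffer);
  return isHost ? 'host' : 'other';
}

// Usage: the Buffer returned by fs.readFileSync in the report above would be
// classified as host memory.
console.log(classifyArrowInput(Buffer.from([0xff, 0xff, 0xff, 0xff])));
```

A check of this kind would let the binding choose a host-memory read path, or throw a catchable JavaScript error, instead of reaching the device-to-host copy that aborts the process.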