[FEA] Use CUDA Arrow IPC primitives to read and write DFs in GPU memory
trxcllnt opened this issue
The Arrow IPC primitives support reading and writing Tables and Columns in GPU memory. We should add support for reading the Arrow IPC format when the input data is a CUDA buffer, and for writing DataFrames to CUDA buffers in the Arrow IPC format.
This would allow us to easily serialize a DataFrame to GPU memory, share that memory with multiple processes (via CUDA IPC), and allow those processes to zero-copy read the Arrow Table from the shared memory pointer and use its buffers as the backing storage for a DataFrame.
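To make the workflow concrete, here is a rough host-memory analogy using only Python's standard library. It is not CUDA IPC and not the Arrow IPC format: `write_column`, `read_column`, and the 8-byte length header are hypothetical stand-ins chosen for illustration. The sketch serializes one int32 column into a shared block, attaches to that block by name (as a second process would), and reads the values through a `memoryview` without copying them.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.calcsize("=q")  # 8-byte row count (hypothetical layout, not Arrow IPC)

def write_column(values):
    """Serialize int32 values into a new shared-memory block: row count, then data."""
    payload = struct.pack(f"=q{len(values)}i", len(values), *values)
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm

def read_column(name):
    """Attach to an existing block by name and return a zero-copy int32 view."""
    shm = shared_memory.SharedMemory(name=name)
    # The block may be larger than requested, so trust the header, not shm.size.
    (n,) = struct.unpack_from("=q", shm.buf, 0)
    view = shm.buf[HEADER:HEADER + 4 * n].cast("i")
    return shm, view

owner = write_column([1, 2, 3, 4])
reader, col = read_column(owner.name)  # a second process would do this step

total = sum(col)
first = col[0]

# A write through the owner's handle is visible through the reader's view,
# which shows the reader is looking at the same memory rather than a copy.
struct.pack_into("=i", owner.buf, HEADER, 100)
seen = col[0]

# Views must be released before the mapping can be closed.
col.release()
reader.close()
owner.close()
owner.unlink()
```

In the GPU version proposed here, the shared block would be a CUDA buffer exported through CUDA IPC, and the payload would be an Arrow IPC message rather than this ad hoc layout.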
cuDF Python has support for zero-copy reading the Arrow IPC format stored in a CUDA buffer [1] [2], with a bit of help from libcudf [3]. It doesn't support writing the Arrow IPC format to a CUDA buffer, but we should be able to use the reading logic as a guide.
Done in #250