dmlc / dlpack

common in-memory tensor structure

Home Page: https://dmlc.github.io/dlpack/latest


Specify DLPack helper C APIs

leofang opened this issue

In recent discussions scattered across several threads, it has become apparent that some helper functions would be better implemented by DLPack itself, so that downstream libraries do not have to reinvent the wheel. The possibilities include:

  1. tensor deleter (see the discussion starting at #51 (comment))
  2. query whether memory is CPU-accessible (see #71 (comment); a sketch follows this list)
  3. C APIs corresponding to __dlpack__ and __dlpack_device__ in the Python Array API standard for handling streams (#65)
  4. C API that could be exposed as a new Python attribute __dlpack_info__ for returning API and ABI versions (and potentially more, see #34, #72)
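
For the CPU-accessibility query (item 2), here is a minimal sketch of what such a helper could look like, written in Python for illustration (the proposal is for a C API; the DLDeviceType values below are from dlpack.h, while the helper name and the exact set of CPU-accessible types are assumptions):

# Hypothetical helper mirroring the proposed CPU-accessibility query.
# Device-type values are the DLDeviceType enum from dlpack.h.
kDLCPU = 1           # plain host memory
kDLCUDAHost = 3      # CUDA pinned host memory, dereferenceable on CPU
kDLCUDAManaged = 13  # CUDA managed/unified memory, dereferenceable on CPU

# Illustrative choice; the authoritative set would be defined by the spec.
CPU_ACCESSIBLE_DEVICE_TYPES = {kDLCPU, kDLCUDAHost, kDLCUDAManaged}

def dlpack_is_cpu_accessible(device_type):
    # True if memory of this device type can be dereferenced by the CPU.
    return device_type in CPU_ACCESSIBLE_DEVICE_TYPES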

cc: @tqchen @rgommers @seberg @eric-wieser @kkraus14 @jakirkham @hameerabbasi @vadimkantorov @oleksandr-pavlyk @szha @veritas9872

Hi guys, I am bringing some steroids to stimulate progress toward closing this issue!

  1. C API that could be exposed as a new Python attribute __dlpack_info__ for returning API and ABI versions (and potentially more, see "Future ABI compatibility" #34 and "Add ABI version" #72)

The need to retrieve API/ABI versions has been brought up repeatedly on several occasions that I am aware of, including in a recent weekly Array API call. Furthermore, @emcastillo has kindly put together a toy case showing that #71 can be a backward-incompatible change:

import torch
import torch.utils.dlpack
import cupy

# Route all CuPy allocations through CUDA managed (unified) memory.
cupy.cuda.set_allocator(cupy.cuda.MemoryPool(cupy.cuda.malloc_managed).malloc)

def get_torch_managed(size):
    # Export a CuPy array as a DLPack capsule and hand it to PyTorch.
    # With the allocator above, a CuPy build that supports DLPack v0.6
    # exports device type kDLCUDAManaged (added by #71).
    return torch.utils.dlpack.from_dlpack(cupy.empty(size).toDlpack())

a = get_torch_managed((3, 2, 1))  # breaks if PyTorch predates DLPack v0.6
The code here uses CuPy to allocate CUDA managed memory and exposes it to PyTorch. The problem is that before #71 landed in DLPack v0.6, library developers could disguise managed memory as normal device memory to facilitate the exchange. However, if the Producer (CuPy here) supports kDLCUDAManaged from #71 but the Consumer (PyTorch here) does not, the snippet above breaks during the handshake, because the Consumer does not recognize the new device type.
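
To make the failure mode concrete, here is a sketch of the consumer-side check that rejects the capsule (Python for illustration; only the DLDeviceType values come from dlpack.h, the function name and error text are hypothetical):

# Device types a consumer built against DLPack v0.5 knows about;
# kDLCUDAManaged = 13 (added in v0.6 by #71) is absent.
KNOWN_DEVICE_TYPES_V05 = {1, 2, 3, 4, 7, 8, 9, 10, 12}

def consume_capsule(device_type):
    # An old consumer has no branch for an unknown device type, so all
    # it can do is raise a (typically cryptic) error.
    if device_type not in KNOWN_DEVICE_TYPES_V05:
        raise ValueError("unsupported DLPack device type: %d" % device_type)
    # ... proceed with the actual import ...

try:
    consume_capsule(13)  # capsule from a v0.6 producer
except ValueError as e:
    print(e)  # the cryptic failure the snippet above runs into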

tl;dr: We really need to be able to query the supported API/ABI version.
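
For illustration, a minimal sketch of what such a query could look like on the Python side (__dlpack_info__ is the attribute name floated in #34/#72; the return shape and the consumer check are assumptions, not a settled spec):

class Array:
    def __dlpack_info__(self):
        # Hypothetical: report the DLPack API/ABI versions this producer
        # supports, as (major, minor) tuples.
        return {"api_version": (0, 6), "abi_version": (0, 6)}

# A consumer that only understands v0.5 can now fail early and clearly:
info = Array().__dlpack_info__()
if info["api_version"] > (0, 5):
    raise BufferError("exported DLPack version is too new for this consumer")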

tl;dr: We really need to be able to query the supported API/ABI version.

In the example, wouldn't you also need to request the API version (as in "Please give me the struct, with a max version of X.Y")? The exporter (which supports the newer version) must know that the importer is stuck on the older one.

As helpful as signalling the exported version is, it does not replace a Python-side mechanism (i.e. in the __dlpack__ spec) for requesting an older version, which would be necessary to do more than raise an informative error that the exported version is too new.
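
As a sketch of what requesting a version could look like (the max_version keyword, the helper method, and the fallback logic are all hypothetical here, illustrating the idea rather than any agreed-upon API):

class Array:
    def _to_capsule(self, version):
        # Stand-in for the real capsule export; returns a marker here.
        return ("capsule", version)

    def __dlpack__(self, *, stream=None, max_version=None):
        # Hypothetical negotiation: the consumer states the newest DLPack
        # version it understands; the producer downgrades if it can.
        if max_version is None or max_version >= (0, 6):
            return self._to_capsule(version=(0, 6))
        # A producer relying on v0.6-only features (e.g. kDLCUDAManaged)
        # would instead raise BufferError here.
        return self._to_capsule(version=(0, 5))

# A consumer stuck on v0.5 asks for at most that version:
capsule = Array().__dlpack__(max_version=(0, 5))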

Sorry @seberg I dropped the ball...

as in "Please give me the struct, with a max version of X.Y"

I am not sure if I understand it correctly. I assume you mean that if the Consumer only supports v0.5, the Producer (on v0.6) should be able to export a v0.5 struct. But wouldn't this create a huge maintenance burden, having to keep track of all past versions of DLPack as time goes by?

I would be inclined to have an error raised here, but currently the errors each library raises in such a scenario are cryptic; as of now the informative error ("the exported version is too new") you mentioned cannot be generated, due to the lack of an API version query...

Yeah, that is what I thought. I am not sure that in this case the "unsupported (new?) device" error would be worse than "incompatible version", but then you could argue that adding devices could just be a minor version bump anyway.

Even without requesting, an exporter could choose to export an old version for easier compatibility (if it has feature parity for the use case). I guess requesting a version could paper over some transitions, but I am not sure that it is important.