dmlc / dlpack

common in-memory tensor structure

Home Page: https://dmlc.github.io/dlpack/latest

How to share data without requiring the consumer to "own" the input tensor?

galv opened this issue

I would like to have PyTorch code call a subroutine that uses DLPack, so that the subroutine is generic across frameworks. However, I noticed that the interface provided by DLPack allows a compliant producer to output only a DLManagedTensor, not a DLTensor, requiring the consumer subroutine to take ownership of the input. Indeed, the documentation says: "The consumer must transfer ownership of the DLManagedTensor from the capsule to its own object."

This is no good if you want the input tensor to continue to be used after the subroutine returns. Here is a small example to describe what I want to do:

my_model = Model()
output = my_model(input)
result = foreign_library.function_accepting_dl_tensor(output.detach())
print(output) # Accessing deleted memory

Is DLPack just no good for this use case? __dlpack__() outputs a DLManagedTensor (well, a capsule referring to a DLManagedTensor), which forces the consumer to take ownership (i.e., it implements what C++ programmers call "move semantics").
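For context, here is a minimal sketch of the single-consumer handshake the protocol expects; consuming the capsule with PyTorch itself is just for illustration (in practice the consumer would be the foreign library):

import torch

t = torch.arange(4.0)
capsule = t.__dlpack__()         # PyCapsule wrapping a DLManagedTensor
t2 = torch.from_dlpack(capsule)  # the consumer takes ownership of the DLManagedTensor;
                                 # the capsule is marked as used and cannot be handed
                                 # to a second consumer
print(t2)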

I suppose one workaround might be:

my_model = Model()
output = my_model(input)
output_managed_tensor = output.detach().__dlpack__()
result = foreign_library.function_accepting_dl_tensor(output_managed_tensor)
output2 = torch.from_dlpack(output_managed_tensor)
print(output2) # No longer accessing deleted memory

However, this causes the chain of deleters to grow every time this round trip happens (each new deleter function pointer has to call the previous one), so it seems far from ideal.
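To make that concern concrete, here is a hypothetical Python sketch (plain Python objects standing in for DLManagedTensor, not real DLPack code) of how each export-and-reimport round trip wraps the previous deleter, so the chain grows by one level per trip:

# `Managed` stands in for a DLManagedTensor; `data` for the underlying buffer.
class Managed:
    def __init__(self, data, deleter):
        self.data = data
        self.deleter = deleter  # called exactly once when the consumer is done

def reexport(managed):
    """Consume `managed` and produce a new handle whose deleter must
    eventually invoke the previous deleter (the chain grows by one)."""
    def chained_deleter():
        managed.deleter()  # forward to the previous deleter
    return Managed(managed.data, chained_deleter)

m = Managed(data=b"buffer", deleter=lambda: print("freeing original buffer"))
for _ in range(3):
    m = reexport(m)  # after N round trips, N nested deleters wrap the original
m.deleter()          # unwinds the whole chain before the buffer is finally freed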

Is there basically no way to access just a DLTensor instead of a DLManagedTensor if I want reference semantics instead of move semantics?

Mildly curious if you encountered this in your nanobind project's ndarray implementation and have any thoughts you'd be willing to share @wjakob.

It is possible for the tensor to continue to be shared. The mechanism is to have the DLPack export incref the tensor and have the DLManagedTensor's deleter decref the reference count rather than perform the actual deletion. I believe that is also how many existing framework integrations already implement it.
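In Python terms, a hedged sketch of that reference-counting idea (the ManagedHandle class below is illustrative, not a real DLPack API): the exported handle simply keeps the producing tensor alive, and its deleter drops that reference instead of freeing the memory.

import torch

class ManagedHandle:
    """Hypothetical stand-in for a DLManagedTensor: manager_ctx keeps the
    producing tensor alive; the deleter drops that reference instead of freeing."""
    def __init__(self, tensor):
        self.manager_ctx = tensor          # "incref": hold a reference to the producer
        self.data_ptr = tensor.data_ptr()  # what dl_tensor.data would point at

    def deleter(self):
        # "decref": release our reference; the memory is freed only when the
        # last owner (the framework's own refcount) lets go of it.
        self.manager_ctx = None

t = torch.arange(4.0)
h1 = ManagedHandle(t)  # export once
h2 = ManagedHandle(t)  # export again -- the tensor is shared, not moved
h1.deleter()           # dropping one handle leaves `t` and `h2` valid
print(t)               # still safe to use after that consumer is done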

We have developed a way to share a DLPack tensor without following the stringent ownership requirements placed by the DLPack protocol. We term the new protocol the "producer-only DLPack" protocol. You can read about it in our arXiv paper (https://arxiv.org/pdf/2404.04118), Section IV-A.

Is this something you were looking for?