Implicit zero copy on `Vec<u8>`

temeddix opened this issue a year ago

Hi, thank you very much for this library. This post is an idea rather than an issue.

I've been wondering why we need to copy Vec<u8> when we have the ability to 'zero copy' the byte array. What if we could zero copy Vec<u8> types by default, without an explicit ZeroCopyBuffer? If I can contribute, would it be okay to work on this idea?

Donghyun Kim commented a year ago

Thanks!

shekohex · Answer 1 · Wed May 24 2023 20:19:25 GMT+0800 (China Standard Time)

Hi @temeddix

Thank you for your feedback and interest in contributing to the allo-isolate project. I appreciate your idea of enabling zero-copy behavior by default for Vec<u8> types.

While zero-copy can be beneficial in terms of performance and memory efficiency, I believe it's important to make it an opt-in feature rather than opt-out. allo-isolate aims to be a low-level crate that provides developers with full control over the interaction between Rust and Dart. We have examples and documentation that explain the difference between zero-copy and non-zero-copy approaches.

By leaving the default behavior as it is, we ensure that developers consciously choose the "best option" based on their specific use cases. Making developers opt out of a feature would go against the purpose of having features and wouldn't align with the crate's philosophy.

However, I'm definitely open to the idea of introducing a "zero-copy" feature that would allow developers to enable zero-copy behavior for every Vec<u8> automatically, without the need for manual wrapping into ZeroCopyBuffer. This would provide convenience for those who prefer zero-copy by default. But it's important to note that this feature will not be turned on by default; instead, it will be opt-in.
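To make the opt-in idea concrete, here is a minimal sketch of how a compile-time switch could look. Only the feature name "zero-copy" comes from this discussion; the `Transfer` enum and both functions are hypothetical stand-ins, not allo-isolate's actual API.

```rust
/// Hypothetical model of how a byte buffer could reach Dart.
#[derive(Debug, PartialEq)]
pub enum Transfer {
    /// Dart receives its own copy of the bytes.
    Copied(Vec<u8>),
    /// Dart receives a view of the allocation that Rust still owns.
    ZeroCopy(Vec<u8>),
}

/// Picks the transfer mode at compile time.
/// Without the (assumed) "zero-copy" Cargo feature, the copying path is used.
#[allow(unexpected_cfgs)]
pub fn encode_bytes(bytes: Vec<u8>) -> Transfer {
    if cfg!(feature = "zero-copy") {
        Transfer::ZeroCopy(bytes)
    } else {
        Transfer::Copied(bytes)
    }
}

/// Explicit wrapping stays available regardless of the feature,
/// mirroring what a manual ZeroCopyBuffer wrapper does today.
pub fn encode_bytes_zero_copy(bytes: Vec<u8>) -> Transfer {
    Transfer::ZeroCopy(bytes)
}
```

With a design like this, a crate that never enables the feature sees no change in behavior, while one that does gets zero-copy for every plain byte vector.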

I value your opinion on this matter. If you'd like to proceed with implementing the "zero-copy" feature as an opt-in, I'd be more than happy to review your contributions and discuss further.

Donghyun Kim · Answer 2 · Wed May 24 2023 20:32:33 GMT+0800 (China Standard Time)

I see :) So this idea would be better provided via a Cargo feature than as the default. Thanks!

fzyzcjy · Answer 3 · Fri May 26 2023 21:28:27 GMT+0800 (China Standard Time)

> While zero-copy can be beneficial in terms of performance and memory efficiency, I believe it's important to make it an opt-in feature rather than opt-out.

I agree. Intuitively, I think of naive copying as the "default" thing, but I cannot back that up with strong evidence. So I wonder what you think: why should we consider naive copying, rather than zero-copy, the "default" thing?

For example, in Rust semantics, move (analogous to zero-copy) is the default thing and copy/clone is the secondary opt-in thing. So it is like the opposite.
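A small sketch of that Rust behavior, using only the standard library (the function names are made up for illustration): a move hands over the existing heap buffer, while an explicit clone allocates a new one.

```rust
/// Moves the vector to a new owner and reports whether the heap buffer
/// stayed at the same address.
pub fn move_keeps_allocation(bytes: Vec<u8>) -> bool {
    let before = bytes.as_ptr();
    let moved = bytes; // ownership transfer; the buffer itself is not copied
    moved.as_ptr() == before
}

/// Clones the bytes into a new vector and reports whether the clone
/// points at the same heap buffer as the source.
pub fn clone_shares_allocation(bytes: &[u8]) -> bool {
    let cloned = bytes.to_vec(); // explicit, opt-in copy into a new allocation
    cloned.as_ptr() == bytes.as_ptr()
}
```

For a non-empty buffer, the first function returns `true` and the second returns `false`.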

Donghyun Kim · Answer 4 · Fri May 26 2023 21:39:53 GMT+0800 (China Standard Time)

IMHO zero-copy being the default would not be bad.

Copying and moving both take ownership of the memory away from the Rust code, so a move is always the more efficient of the two. However, I also appreciate @shekohex's point that moving and copying are different, so I think we are standing between 'practicality' and 'the original philosophy'.

shekohex · Answer 5 · Fri May 26 2023 22:17:52 GMT+0800 (China Standard Time)

> [...] in Rust semantics, move (analogous to zero-copy) is the default thing and copy/clone is the secondary opt-in thing. So it is like the opposite.

I agree, but this case is different. The closest analogue in Rust's ownership model is that Rust lets you pass a reference to the data so the other end can read from it. In our case, that is what "zero-copy" means: passing a reference to the memory from Rust to Dart. It also tells the DartVM that this is non-GC memory managed somewhere else, and that when it is done with it, it should call the provided callback so the original owner can free the backing memory.

In general, our discussion here is not really about Rust semantics; it is more about the original philosophy behind this crate.

Donghyun Kim · Answer 6 · Fri May 26 2023 22:28:52 GMT+0800 (China Standard Time)

I got curious...

Does that mean that when data is sent from Rust to Dart via zero-copy, it is not garbage collected by Dart? Then, is it something like this: after the reference count drops to zero, it comes back to Rust and is dropped on the Rust side?

Do we have to manually drop or dispose the memory at the Rust side after that?

Apologies for so many questions, I'm kind of a newbie here :\

shekohex · Answer 7 · Fri May 26 2023 22:48:37 GMT+0800 (China Standard Time)

When we use the zero-copy feature and mark the data as DartExternalTypedData, DartVM doesn't take ownership of the memory. Instead, it allows you to work with the data as it is. For example, let's say you have an image stored on disk and you want to modify it by applying a black-and-white effect using a Rust function. Instead of saving the modified image back to disk immediately, you may want to show it to the user first. In this case, you can send the image as bytes to DartVM. From there, you can read and display the bytes. However, it's important to note that the ownership of the memory still belongs to the Rust side. When DartVM is finished using the memory (for example, when you've already viewed the image or the user has gone back to the previous page), DartVM will invoke a callback provided by Rust. This callback allows Rust to free the memory and release it back to the system.
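The flow can be modelled in plain Rust. This is only a sketch under stated assumptions: `ExternalBytes`, `into_external`, `free_vec` and `simulate_vm` are hypothetical names, the finalizer signature is simplified, and no real Dart API is called. It shows the shape of the handshake, not allo-isolate's implementation.

```rust
use std::ffi::c_void;
use std::slice;

/// Hypothetical stand-in for an external typed data message: a view of the
/// bytes plus what the receiver needs to hand the allocation back.
pub struct ExternalBytes {
    pub data: *const u8,
    pub len: usize,
    pub peer: *mut c_void,
    pub finalizer: unsafe extern "C" fn(peer: *mut c_void),
}

/// Finalizer supplied by the Rust side: rebuilds the boxed Vec and drops it,
/// which frees the backing memory.
unsafe extern "C" fn free_vec(peer: *mut c_void) {
    unsafe { drop(Box::from_raw(peer as *mut Vec<u8>)) };
}

/// Hands the buffer out without copying any bytes. Rust stops dropping it
/// automatically; the finalizer becomes responsible for freeing it.
pub fn into_external(bytes: Vec<u8>) -> ExternalBytes {
    let boxed = Box::new(bytes);
    let (data, len) = (boxed.as_ptr(), boxed.len());
    ExternalBytes {
        data,
        len,
        peer: Box::into_raw(boxed) as *mut c_void,
        finalizer: free_vec,
    }
}

/// Stand-in for the VM: reads the bytes in place, then calls the finalizer
/// exactly once when it no longer needs them.
pub fn simulate_vm(msg: ExternalBytes) -> u32 {
    // SAFETY: `data` and `len` describe the live Vec kept alive through `peer`.
    let view = unsafe { slice::from_raw_parts(msg.data, msg.len) };
    let sum: u32 = view.iter().map(|&b| u32::from(b)).sum();
    // SAFETY: `peer` came from Box::into_raw and has not been freed yet.
    unsafe { (msg.finalizer)(msg.peer) };
    sum
}
```

For example, `simulate_vm(into_external(vec![1, 2, 3]))` reads the three bytes through the raw pointer and then lets Rust free them, with no copy made at any point.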

> Do we have to manually drop or dispose the memory at the Rust side after that?

Nope, the allo-isolate crate already provides that callback to the DartVM to call.

If you want to read more about the internals you can read the comment here: https://github.com/dart-lang/sdk/blob/e213846ba09bc56fe4b0ecd1bf1d88c6153bffd1/runtime/include/dart_native_api.h#L12-L42

fzyzcjy · Answer 8 · Sat May 27 2023 13:19:18 GMT+0800 (China Standard Time)

I see. Interesting points!