Extra data copy during write
asomers opened this issue · comments
During a write, fuse3 first copies data from the kernel into userland in Session::dispatch. Then it passes a slice of that buffer to handle_write, which ends up copying the data again into a new Vec. It then passes that data as a slice to Filesystem::write, where it might well be copied again. The same thing happens in setxattr.
Instead, Session::dispatch should read from /dev/fuse using readv into a header-sized buffer and a large data buffer. Then it should pass the data buffer by value to Filesystem::write as a Vec. That would eliminate one data copy, and possibly two, depending on how the file system implements write.
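To make the proposal concrete, here is a rough sketch of the split read using std's vectored I/O. It is not fuse3's actual code: read_request and IN_HEADER_LEN are hypothetical names, and the 40-byte header size is my assumption about fuse_in_header. Also note that for a real write request the opcode-specific arguments (fuse_write_in) sit between the header and the file data, so the split point would have to account for them.

```rust
use std::io::{self, IoSliceMut, Read};

/// Assumed size of the kernel's `fuse_in_header`, in bytes.
const IN_HEADER_LEN: usize = 40;

/// Hypothetical helper: read one request into a fixed-size header buffer
/// and a separate, owned payload buffer with a single vectored read.
fn read_request<R: Read>(
    dev: &mut R,
    max_payload: usize,
) -> io::Result<([u8; IN_HEADER_LEN], Vec<u8>)> {
    let mut header = [0u8; IN_HEADER_LEN];
    let mut payload = vec![0u8; max_payload];

    // One call fills the header first, then spills into the payload buffer.
    let n = dev.read_vectored(&mut [
        IoSliceMut::new(&mut header),
        IoSliceMut::new(&mut payload),
    ])?;
    if n < IN_HEADER_LEN {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "short FUSE request",
        ));
    }

    // Keep only the bytes that were actually read. The Vec can now be
    // passed by value to the filesystem without another copy.
    payload.truncate(n - IN_HEADER_LEN);
    Ok((header, payload))
}
```

On a File opened on /dev/fuse, read_vectored maps to readv on Unix, so the payload lands directly in the Vec that would be handed to Filesystem::write.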
Using writev for replies should also avoid a memory copy, since we own both the header buffer and the user data (for example, Filesystem::read returns Bytes).
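A minimal sketch of the reply side, assuming the reply header is fuse_out_header laid out as len (u32), error (i32), unique (u64). The names out_header and send_reply are made up for illustration and are not part of fuse3. The point is that the header and the data stay in their own buffers and go out in one vectored write, so nothing has to be concatenated first.

```rust
use std::io::{self, IoSlice, Write};

/// Assumed size of `fuse_out_header`: len (u32) + error (i32) + unique (u64).
const OUT_HEADER_LEN: usize = 16;

/// Hypothetical helper: build the reply header for a payload of `payload_len` bytes.
fn out_header(payload_len: usize, error: i32, unique: u64) -> [u8; OUT_HEADER_LEN] {
    let mut h = [0u8; OUT_HEADER_LEN];
    let total = (OUT_HEADER_LEN + payload_len) as u32;
    h[0..4].copy_from_slice(&total.to_ne_bytes());
    h[4..8].copy_from_slice(&error.to_ne_bytes());
    h[8..16].copy_from_slice(&unique.to_ne_bytes());
    h
}

/// Hypothetical helper: send header and data with one vectored write,
/// without first copying them into a single contiguous buffer.
fn send_reply<W: Write>(dev: &mut W, unique: u64, data: &[u8]) -> io::Result<()> {
    let header = out_header(data.len(), 0, unique);
    let total = header.len() + data.len();

    // A single call on purpose: the reply should reach the device as one
    // message, so a partial write is reported as an error, not retried.
    let n = dev.write_vectored(&[IoSlice::new(&header), IoSlice::new(data)])?;
    if n != total {
        return Err(io::Error::new(io::ErrorKind::WriteZero, "partial FUSE reply"));
    }
    Ok(())
}
```

With Bytes as the data type, the slice passed as data would simply borrow from the Bytes value returned by Filesystem::read.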
When reading a FUSE request, we can allocate two buffers: one for the header and one for the FUSE data. When we receive a write opcode, consider this description of the write size limit:
The max size of write requests from the kernel. The absolute minimum is 4k, FUSE recommends at least 128k, max 16M. The FUSE default is 16M on macOS and 128k on other systems.
The data may be large or small:
- Small, e.g. 4K of data: if we pass the data buffer to Filesystem::write, we need to allocate the data buffer (which is 16M in size) again.
- Large, e.g. 15M of data: we pass the data buffer to Filesystem::write and then allocate the data buffer again, but this is no different from the status quo.
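One possible way to handle both cases, sketched under my own assumptions: hand the buffer over only when it is mostly full, and copy small payloads into a right-sized Vec so the large buffer can be reused. The take_payload name and the half-full threshold are arbitrary choices for illustration, not something fuse3 does today.

```rust
/// Hypothetical policy for turning the receive buffer into an owned payload.
///
/// `buf` is the reusable receive buffer of `capacity` bytes and `len` is the
/// number of payload bytes the last read put into it.
fn take_payload(buf: &mut Vec<u8>, len: usize, capacity: usize) -> Vec<u8> {
    if len * 2 >= capacity {
        // Large write: give the whole buffer away and allocate a replacement.
        // This costs one allocation but no copy of the payload.
        let mut owned = std::mem::replace(buf, vec![0u8; capacity]);
        owned.truncate(len);
        owned
    } else {
        // Small write: copy only `len` bytes and keep the big buffer
        // for the next request.
        buf[..len].to_vec()
    }
}
```

The small-write branch still copies, but only a few kilobytes, which avoids re-allocating a 16M buffer for every 4K write.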
Anyway, we can first replace read/write with readv/writev, and then find a way to improve the write opcode.
BTW, the maximum size of a write that a filesystem will receive is given by the max_write field during FUSE_INIT. So it could be much less than 16M.
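As a small illustration, the data buffer could be sized from the negotiated value rather than the worst case. This is only a sketch: data_buffer_capacity is a hypothetical helper, and the 4k and 16M bounds are taken from the limits quoted earlier in this thread.

```rust
/// Bounds quoted earlier in this thread: absolute minimum 4k, maximum 16M.
const MIN_MAX_WRITE: u32 = 4 * 1024;
const MAX_MAX_WRITE: u32 = 16 * 1024 * 1024;

/// Hypothetical helper: choose the data buffer size from the `max_write`
/// value negotiated during FUSE_INIT, clamped to the quoted bounds.
fn data_buffer_capacity(negotiated_max_write: u32) -> usize {
    negotiated_max_write.clamp(MIN_MAX_WRITE, MAX_MAX_WRITE) as usize
}
```

With a 128k max_write, handing the buffer to the filesystem and allocating a new one costs a 128k allocation per write rather than 16M.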