littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers

pread/pwrite

yamt opened this issue

it would be nice to support pread/pwrite natively.

while an emulation with seek is possible, i guess it can have a few drawbacks:

  • embedders likely need a recursive lock to avoid races (see the sketch below)
  • it's likely inefficient as littlefs's seek also moves cache position
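
to illustrate the first point, the emulation is a seek/read/seek sequence on a shared file handle, so the embedder has to bracket the whole thing with a lock of their own. a rough sketch follows; lfs_file_seek/lfs_file_read are the real littlefs calls, while embedder_lock/embedder_unlock just stand in for whatever mutex the embedder uses. if littlefs is built with LFS_THREADSAFE, each lfs_file_* call below also takes the configured lock, which is why that lock would need to be recursive (or a separate outer mutex is needed):

// not atomic without the outer lock: another thread could move the
// file position between the seek and the read
lfs_ssize_t emu_pread(lfs_t *lfs, lfs_file_t *file,
        void *buf, lfs_size_t size, lfs_off_t off) {
    embedder_lock();
    lfs_soff_t pos = lfs_file_seek(lfs, file, 0, LFS_SEEK_CUR); // save position
    lfs_file_seek(lfs, file, off, LFS_SEEK_SET);
    lfs_ssize_t res = lfs_file_read(lfs, file, buf, size);
    lfs_file_seek(lfs, file, pos, LFS_SEEK_SET);                // restore position
    embedder_unlock();
    return res;
}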

Hi @yamt, thanks for opening an issue.

I'm glad you proposed this, because I only recently learned that pread/pwrite are part of standard POSIX. It makes sense as an API and I think it would be good to add.

pread/pwrite also shouldn't add much code cost as most of the logic should be shared behind the scenes.

it's likely inefficient as littlefs's seek also moves cache position

The reason littlefs's seek moves the cache is that we usually need it for reading/writing, so I don't think pread/pwrite will escape this cost. Maybe the exception is when pread/pwrite bypasses the cache.


That being said, API changes that don't touch the disk format are low priority at the moment, so it may take some time for this to land.

it's likely inefficient as littlefs's seek also moves cache position

The reason littlefs's seek moves the cache is that we usually need it for reading/writing, so I don't think pread/pwrite will escape this cost. Maybe the exception is when pread/pwrite bypasses the cache.

i meant that a straightforward emulation with lseek, like the following, would be inefficient as it moves the position back and forth. a native implementation could do better.

lfs_ssize_t lfs_file_pread(lfs_t *lfs, lfs_file_t *file,
        void *buffer, lfs_size_t size, lfs_off_t off) {
    lfs_soff_t orig_pos = lfs_file_seek(lfs, file, 0, LFS_SEEK_CUR); // remember position
    lfs_file_seek(lfs, file, off, LFS_SEEK_SET);                     // jump to requested offset
    lfs_ssize_t res = lfs_file_read(lfs, file, buffer, size);
    lfs_file_seek(lfs, file, orig_pos, LFS_SEEK_SET);                // jump back
    return res;
}

A native implementation could save a CTZ skip-list lookup, that's true. And there may be other opportunities for minor optimizations.

But caching is a bit of a harder problem, since lfs_file_pread doesn't know what the next operation will be. It may be followed immediately by another pread if upper layers emulate read with multiple pread calls, for example.
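
Something like this sketch of an upper layer that only has pread underneath and fakes a sequential read by tracking its own offset (lfs_file_pread being the proposed call; wrapped_file/wrapped_read are made up for illustration):

// every sequential read from the upper layer turns into a pread at a tracked offset
struct wrapped_file {
    lfs_file_t *file;
    lfs_off_t pos;
};

lfs_ssize_t wrapped_read(lfs_t *lfs, struct wrapped_file *wf,
        void *buf, lfs_size_t size) {
    lfs_ssize_t res = lfs_file_pread(lfs, wf->file, buf, size, wf->pos);
    if (res > 0) {
        wf->pos += res;
    }
    return res;
}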

But caching is a bit of a harder problem, since lfs_file_pread doesn't know what the next operation will be. It may be followed immediately by another pread if upper layers emulate read with multiple pread calls, for example.

are you concerned about the cache-hit rate here?

Ah yeah. Cases like these:

// write blob
pwrite(f, <data_blob>, <data_blob_len>, 0);

// scan for start of payload
char buf[sizeof("<header>") - 1];
lfs_off_t payload_start = 0;
while (true) {
    pread(f, buf, sizeof(buf), payload_start);
    if (memcmp(buf, "<header>", sizeof(buf)) == 0) {
        break;
    }

    payload_start += 1;
}

You could argue this should be done with seek + read, but I'm not sure that's always possible when other OS layers get involved. FUSE for example only talks in pread/pwrite (fuse_lowlevel_ops.read).
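To make that concrete, a littlefs-backed FUSE read handler ends up looking roughly like this sketch (using the high-level fuse_operations.read for brevity rather than the lowlevel API; lfsfuse_read and the global lfs handle are made up here, and lfs_file_pread is the API proposed in this issue):

#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <errno.h>
#include <stdint.h>
#include "lfs.h"

extern lfs_t lfs; // mounted littlefs instance, assumed global for the sketch

// FUSE only ever hands us an absolute offset, never a "current position",
// so the natural mapping is a pread-style call
static int lfsfuse_read(const char *path, char *buf, size_t size,
        off_t off, struct fuse_file_info *fi) {
    (void)path;
    lfs_file_t *file = (lfs_file_t *)(uintptr_t)fi->fh; // handle stashed by open()
    lfs_ssize_t res = lfs_file_pread(&lfs, file, buf, size, (lfs_off_t)off);
    return (res < 0) ? -EIO : (int)res;
}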

Though, as an "optimization problem", there may not be a single correct impl here.

Though I suppose you could also argue the FUSE impl should still call seek + read if only for cache reasons.

Maybe there's no clear best option here. Not moving the cache in pread/pwrite at least gives users more control, even if the performance differences may initially be confusing to users.