Feh / nocache

minimize caching effects

Where noCACHE ?

pavlinux opened this issue

Recursively find in the Linux kernel tree:

$ time ./nocache find /media/kernel/linux/
...
real    0m12.242s
user    0m1.219s
sys     0m0.868s

$ time ./nocache find /media/kernel/linux/
real    0m1.963s
user    0m1.015s
sys     0m0.475s

The first time it takes 12 seconds, the next time 2 seconds. :)

This is an issue related to file metadata. nocache works on the content level, i.e. when reading or writing to the file. If you use find, nocache will have no effect. The difference in timing comes from the fact that by the second run the stat() calls return cached metadata.
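
For illustration, here is a rough sketch of the content-level mechanism a tool like nocache relies on: advising the kernel to drop a file's cached data pages with posix_fadvise(). This is a standalone example, not nocache's actual code (nocache intercepts libc calls via LD_PRELOAD), and it only touches the page cache for the file's contents; the dentry/inode caches that find and stat() exercise are left alone, which is why the second find run above is still fast.

/*
 * Minimal sketch (not nocache's actual code): drop a file's cached data
 * pages with posix_fadvise(). This affects only the page cache for the
 * file contents, not the dentry/inode caches used by stat()/find.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Dirty pages have to be written back before they can be dropped. */
    fdatasync(fd);

    /* Advise the kernel that the cached pages are no longer needed. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (err != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

    close(fd);
    return 0;
}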

So it needs to be implemented! :)

I disagree. File metadata caching is really useful almost always and takes up very little memory, I would think.

If it is possible to implement, then a command-line option would suit everybody. :)

As for "very little memory" it depends... For example I have tree with over 12 million files in it. It takes over the hour to walk it and list of files takes well above 8 GiB so even utilities like md5deep or sha1deep choke and can't finish scanning.... Now imagine you need to walk such tree once a day during backup or even just occasionally find a file in there... Surely any valuable data in cache will be displaced from the pressure of tree index and that's inevitably will cause performance degradation in all running processes as there is little chance for cache hit until next backup somewhat 20 hours (or more) later. FYI this tree occupy less than 700 GiB while I have another storage with combined capacity of ~20,000 GiB so you can imagine the scale of a problem...

Ok, I see your point. Actually I have no idea how to prevent metadata caching (or discard the data after use), but I’ll have a look at it some time this week. Reopening the issue for now.

Hi all,
@onlyjob: in standard Linux distros, you would actually have another daily full system scan: updatedb, for the locate facility.

That said, a systemwide solution (so not as specific as nocache) would be to run:

sync                               # write out dirty data first
echo 2 > /proc/sys/vm/drop_caches  # free dentries and inodes
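
The same systemwide drop can be triggered from a program; a small sketch in C, assuming the standard procfs path (also root-only and all-or-nothing):

/*
 * Sketch: programmatic equivalent of the two commands above. Requires
 * root, and it drops *all* reclaimable dentries and inodes, not just
 * those of one file or tree.
 */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Flush dirty data so that clean, reclaimable objects remain. */
    sync();

    FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
    if (!f) {
        perror("fopen /proc/sys/vm/drop_caches");
        return 1;
    }
    fputs("2\n", f);  /* 2 = free reclaimable dentries and inodes */
    if (fclose(f) != 0) {
        perror("write to /proc/sys/vm/drop_caches");
        return 1;
    }
    return 0;
}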

you can check the memory usage of the inode cache by running

slabtop

and looking for *_inode_cache like ext4_inode_cache.
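
If you prefer to check this programmatically, the same numbers are exposed in /proc/slabinfo; a small sketch (reading it usually needs root) that just filters for the inode caches:

/* Sketch: print every slab whose name contains "inode_cache",
 * e.g. ext4_inode_cache. Equivalent to eyeballing slabtop. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/slabinfo", "r");  /* usually root-only */
    if (!f) {
        perror("fopen /proc/slabinfo");
        return 1;
    }

    char line[512];
    while (fgets(line, sizeof line, f))
        if (strstr(line, "inode_cache"))
            fputs(line, stdout);

    fclose(f);
    return 0;
}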

you can also tell your kernel to prefer freeing the inode cache by setting a value > 100 in /proc/sys/vm/vfs_cache_pressure. This is also a systemwide setting, but it's less disruptive than the drop_caches method.
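
A minimal sketch of setting that from C (systemwide, needs root; 200 is just an example value, not a recommendation):

/* Sketch: raise vfs_cache_pressure above 100 so the kernel prefers to
 * reclaim dentries and inodes over page cache. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/vm/vfs_cache_pressure", "w");
    if (!f) {
        perror("fopen /proc/sys/vm/vfs_cache_pressure");
        return 1;
    }
    fprintf(f, "200\n");  /* > 100: prefer reclaiming dentry/inode caches */
    fclose(f);
    return 0;
}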

I haven't yet found how to do this selectively with the existing kernel code base, and frankly I don't know if it's possible, but it would be straightforward to implement a kernel patch to do it:
. for each fs of interest (ext4, xfs, ...), add a function to drop the inode cache for a specific file (in ext4, you would act on ext4_inode_cachep) and call it from the fs driver's own ioctl implementation (ext4_ioctl() for ext4).
. in fs/ioctl.c and fs/compat_ioctl.c, add a new ioctl flag to act on a specific file.

You could also see how shrink_slab() in mm/vmscan.c works, and transpose that to what you need, but you have to find out how to find the driver_mnt_point_inode for each file, which is exactly what the vfs layer does.
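
To make the proposal a bit more concrete, here is what the userspace side of such a patch might look like. The FS_IOC_DROP_INODE_CACHE request below is entirely hypothetical and does not exist in any kernel; it only stands in for whatever flag the patch would add to fs/ioctl.c and to each filesystem's ioctl handler.

/*
 * Hypothetical userspace side of the proposed per-file ioctl.
 * FS_IOC_DROP_INODE_CACHE is made up for illustration; the kernel side
 * (e.g. ext4_ioctl() acting on ext4_inode_cachep) would have to exist.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define FS_IOC_DROP_INODE_CACHE _IO('X', 0x2a)  /* made-up request number */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Ask the filesystem driver to evict this file's cached inode/dentry. */
    if (ioctl(fd, FS_IOC_DROP_INODE_CACHE) != 0)
        perror("ioctl (hypothetical request)");

    close(fd);
    return 0;
}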

Hope it helps, cheers,
Reda

updatedb is easy to control using its config file "/etc/updatedb.conf". The idea is not to drop caches (which is worthless) but to preserve their contents. :) It would be interesting to see how an application can be restricted to using only some of the cache memory, but not all of it...

@onlyjob: yes, it is an interesting topic. Could you also take a look at the kernel source to see if I missed some way to do what @pavlinux wants?

On Tue, 18 Jun 2013 21:05:19 Reda NOUSHI wrote:

@onlyjob, could you also take a look at the kernel source to see if I missed some way to do what @pavlinux wants.

I'm flattered that you think I have the skills to do that. ;)
I really have neither the time nor the expertise to analyse the Linux
kernel source...

Regards,
Dmitry.

@onlyjob :) well, there's a first time for everything!
I need a peer review, @Feh, @pavlinux ... :hint:

From a quick read over the Kernel source I think it’s not easily possible. The best hint is still fs/drop_caches.c. Since

echo 2 > /proc/sys/vm/drop_caches  # free dentries and inodes

is what we’d like to do, but for specific files, let's have a look at what this actually does: it just shrinks the slab, regardless of what's present. Here is the full implementation:

static void drop_slab(void)
{
        int nr_objects;
        struct shrink_control shrink = {
                .gfp_mask = GFP_KERNEL,
        };

        do {
                nr_objects = shrink_slab(&shrink, 1000, 1000);
        } while (nr_objects > 10);
}

So my guess is that if we were to implement this, it would take a few dozen lines of Kernel code, but probably as a Kernel module, which would mean the user requires root privileges.

@onlyjob: How about adjusting /proc/sys/vm/vfs_cache_pressure?

At the default value of vfs_cache_pressure = 100 the kernel will attempt to reclaim dentries and inodes at a “fair” rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.