bitfaster / BitFaster.Caching

High performance, thread-safe in-memory caching primitives for .NET

ConcurrentLru's "GetOrAdd" vs ConcurrentDictionary's "GetOrAdd" behave differently

tcsaddul opened this issue

    using System;
    using System.Collections.Concurrent;
    using BitFaster.Caching.Lru;

    var lru = new ConcurrentLru<int, User>(5);
    var cur = new ConcurrentDictionary<int, User>();

    // First pass: both caches are empty, so every factory delegate runs.
    for (int i = 0; i < 5; i++)
    {
        lru.GetOrAdd(i, k =>
        {
            Console.WriteLine("Called LRU " + i);
            return new User { Id = i, UserName = "Name " + k };
        });

        cur.GetOrAdd(i, k =>
        {
            Console.WriteLine("Called Dict " + i);
            return new User { Id = i, UserName = "Name " + k };
        });
    }

    // Second pass: the dictionary still holds all 5 items, but the LRU
    // has already evicted them, so its factory delegates run again.
    for (int i = 0; i < 5; i++)
    {
        lru.GetOrAdd(i, k =>
        {
            Console.WriteLine("Called Again LRU " + i);
            return new User { Id = i, UserName = "Name " + k };
        });

        cur.GetOrAdd(i, k =>
        {
            Console.WriteLine("Called Again Dict " + i);
            return new User { Id = i, UserName = "Name " + k };
        });
    }

    Console.ReadLine();

    public class User
    {
        public int Id { get; set; }
        public string UserName { get; set; }
    }

Output (screenshot): the first loop prints "Called LRU 0" through "Called LRU 4" interleaved with "Called Dict 0" through "Called Dict 4"; the second loop prints "Called Again LRU 0" through "Called Again LRU 4", with no "Called Again Dict" lines.

This is a quirk of ConcurrentLru's internals that becomes more apparent when the capacity is very small and items are accessed in a sequence that causes them to be evicted before they are requested a second time.

Internally, ConcurrentLru maintains 3 queues: hot, warm and cold. When initialized with capacity = 5, each queue gets a size of capacity/3, so after rounding down each queue internally has a size of 1.
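The rounding is plain integer division:

    int capacity = 5;
    int queueSize = capacity / 3; // rounds down to 1, so 3 internal slots instead of 5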

When an item is fetched for the first time, it is added to the hot queue. When the hot queue is full and another item is added, the last item in hot is moved to either warm or cold, depending on whether it has been accessed since it was added. The same process is followed for the warm and cold queues. When an item is fetched, it is marked as touched. When the queues are cycled, touched items in warm and cold are moved back to the head of warm, untouched items in warm are moved to cold, and untouched items in cold are evicted.
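Here is a minimal, single-threaded sketch of that cycling policy. It is illustrative only: the real ConcurrentLru is built on lock-free concurrent queues and differs in detail, and every name below (PseudoLruSketch, Item, Cycle) is invented for the example.

    using System;
    using System.Collections.Generic;

    class PseudoLruSketch<K, V> where K : notnull
    {
        private class Item
        {
            public K Key;
            public V Value;
            public bool WasAccessed;
        }

        private readonly int queueCapacity;
        private readonly Dictionary<K, Item> map = new();
        private readonly Queue<Item> hot = new();
        private readonly Queue<Item> warm = new();
        private readonly Queue<Item> cold = new();

        public PseudoLruSketch(int capacity) => queueCapacity = capacity / 3;

        public V GetOrAdd(K key, Func<K, V> factory)
        {
            if (map.TryGetValue(key, out var hit))
            {
                hit.WasAccessed = true; // acted on the next time the queues cycle
                return hit.Value;
            }

            var item = new Item { Key = key, Value = factory(key) };
            map[key] = item;
            hot.Enqueue(item); // new items always enter hot
            Cycle();
            return item.Value;
        }

        private void Cycle()
        {
            if (hot.Count > queueCapacity)
            {
                // Leaving hot: touched items go to warm, untouched to cold.
                var h = hot.Dequeue();
                (h.WasAccessed ? warm : cold).Enqueue(h);
                h.WasAccessed = false;
            }

            if (warm.Count > queueCapacity)
            {
                // Touched items cycle back into warm; untouched fall to cold.
                var w = warm.Dequeue();
                (w.WasAccessed ? warm : cold).Enqueue(w);
                w.WasAccessed = false;
            }

            if (cold.Count > queueCapacity)
            {
                // Touched items are rescued back to warm; untouched are evicted.
                var c = cold.Dequeue();
                if (c.WasAccessed)
                {
                    c.WasAccessed = false;
                    warm.Enqueue(c);
                }
                else
                {
                    map.Remove(c.Key);
                }
            }
        }
    }

Running the repro through this sketch with capacity = 5 (so each queue holds 1 item) reproduces the queue states traced below.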

So, in your repro, the sequence of steps is laid out below (and can be confirmed by stepping through in the debugger and looking at the internal state of the LRU).

Initial state, all queues empty and have capacity 1:
H[], W[], C[]

First for loop (items are added):
i = 0. H[0], W[], C[]
i = 1. H[1], W[], C[0]
i = 2. H[2], W[], C[1]
i = 3. H[3], W[], C[2]
i = 4. H[4], W[], C[3]

Second for loop (items are fetched a second time):
i = 0. H[0], W[], C[4]
i = 1. H[1], W[], C[0]
i = 2. H[2], W[], C[1]
i = 3. H[3], W[], C[2]
i = 4. H[4], W[], C[3]

Since each queue has a capacity of 1, items are cycled from hot straight to cold and then evicted before they are fetched again in the second for loop.

If your for loop fetches only 2 items, ConcurrentLru will work as you expect with capacity 5. If you loop over 5 items and set capacity to 9, items will not be evicted before the second loop runs (each queue will have size 3, so you will have 3 items in hot and 2 in cold). Alternatively, you can use ClassicLru, which performs worse under heavy concurrent load but maintains exact item order. Note that even ClassicLru with size 5 will hit the same problem if you fetch 6 items in sequence.
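In code, those options look like this (both classes live in the BitFaster.Caching.Lru namespace, and User is the same type as in the repro):

    // Option 1: capacity 5 is fine if each pass touches only 2 keys.
    var lru5 = new ConcurrentLru<int, User>(5);

    // Option 2: capacity 9 gives three queues of size 3, so hot + cold
    // can hold all 5 items between passes.
    var lru9 = new ConcurrentLru<int, User>(9);

    // Option 3: exact LRU order, at the cost of throughput under heavy
    // concurrent load. Fetching 6 keys in sequence with capacity 5
    // still hits the same problem.
    var classic = new ClassicLru<int, User>(5);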

The ConcurrentLru algorithm is a pseudo LRU and gives approximate LRU order: it trades exact count and sequence guarantees for the best possible performance. In most practical settings, in-sequence fetches aren't common. I mostly tested using a Zipf distribution, which is fairly standard in academic studies.

I can make this slightly better by fixing the rounding that occurs when capacity is not an exact multiple of 3 - this is a bug. For example, if the capacity is set to 5, I can make the hot and cold queues size 2. This means there will be the correct number of internal slots: H:2 + W:1 + C:2 = 5. But for your repro, the sequence of fetches would still force all items to be evicted even with the fix. This is because you are fetching 5 items in sequence before fetching again in sequence, but the capacity of hot + cold would be 4 (the warm queue is never used, because no item is fetched before it is evicted). So capacity is exceeded before items are fetched again.
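With that fix, the split for capacity = 5 would come out like this (the exact division below is my guess at the implementation; the resulting sizes H:2, W:1, C:2 are from the description above):

    int capacity = 5;
    int warm = capacity / 3;             // 1
    int hot = (capacity - warm + 1) / 2; // 2
    int cold = capacity - warm - hot;    // 2, so H:2 + W:1 + C:2 == 5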

In general, to get stable results for repeated in-sequence fetches of length n, ConcurrentLru capacity must be set to at least (n+1) + (n+1)/3.
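As a quick sanity check of that rule (the helper name is mine, and the division is integer division):

    static int MinStableCapacity(int n) => (n + 1) + (n + 1) / 3;

    Console.WriteLine(MinStableCapacity(5)); // 6 + 2 = 8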