ehcache / ehcache3

Ehcache 3.x line

Home Page: http://www.ehcache.org

Some cache puts "disappear" in highly concurrent disk-store eternal cache

msqr opened this issue

I've been trying to figure out an issue where some entries that have been added to a disk-persistent, non-expiring (eternal) cache are never returned through entry iteration, and do not appear to exist even when I inspect the cache manually in a debugger. I've narrowed the issue down as best I can to one test case here, and was hoping someone might take a stab at running this test to see if the problem could be identified.

At a high level, there are 4 "writer" threads adding to the cache, using cache.put().

Then there are 2 "processor" threads iterating over some of the cache keys and then using cache.get(k) followed by cache.remove(k).
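For readers without access to the linked test, the thread layout described above can be sketched roughly as follows. This is not the test from the issue: the class name `BurstHarness`, the `run` method, and the counter names are hypothetical, and a plain `ConcurrentHashMap` stands in for the Ehcache instance so the sketch runs without any dependencies. One deliberate difference is that `ConcurrentHashMap.remove(k)` returns the previous value, which the sketch uses to count each entry exactly once even when two processors race on the same key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness: N writer threads put unique keys, M processor
// threads iterate the keys, then get(k) and remove(k) each entry.
class BurstHarness {

    /** Runs one burst and returns {putCount, storeCount}. */
    static long[] run(int writers, int processors, int putsPerWriter) {
        // Assumption: an in-memory map stands in for the disk-backed cache.
        ConcurrentMap<Long, String> cache = new ConcurrentHashMap<>();
        AtomicLong nextKey = new AtomicLong();
        AtomicLong putCount = new AtomicLong();
        AtomicLong storeCount = new AtomicLong();
        CountDownLatch writersDone = new CountDownLatch(writers);

        Thread[] threads = new Thread[writers + processors];
        for (int i = 0; i < writers; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < putsPerWriter; n++) {
                    long k = nextKey.incrementAndGet(); // unique key per put
                    cache.put(k, "v" + k);
                    putCount.incrementAndGet();
                }
                writersDone.countDown();
            });
        }
        for (int i = 0; i < processors; i++) {
            threads[writers + i] = new Thread(() -> {
                while (true) {
                    // Sample writer state before the pass, so a "finished"
                    // pass is known to have seen every entry ever put.
                    boolean finished = writersDone.getCount() == 0;
                    for (Long k : cache.keySet()) {
                        // Count only when this thread performed the removal.
                        if (cache.get(k) != null && cache.remove(k) != null) {
                            storeCount.incrementAndGet();
                        }
                    }
                    if (finished && cache.isEmpty()) {
                        break;
                    }
                    Thread.yield();
                }
            });
        }

        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
        return new long[] { putCount.get(), storeCount.get() };
    }

    public static void main(String[] args) {
        long[] counts = run(4, 2, 2500);
        System.out.println("Put: " + counts[0] + ", store: " + counts[1]);
    }
}
```

With the in-memory stand-in the two counts should agree. Swapping in a real cache behind the same get/put/remove calls is one way to check whether a mismatch comes from the cache or from the harness itself.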

Note that I have modified the relevant application code, adding extra logging and commenting some parts out to help focus on the issue. The test uses a delegating cache wrapper, on which only the get(k), put(k,v), and remove(k) methods are relevant to the test, and you can see they simply delegate to the Ehcache instance.

Normally everything works OK, but after some heavy "bursts" of writes, some entries are never returned via iteration, and seem to "disappear" as far as the application is concerned. No exceptions are thrown. For the test I've configured the cache with only a disk tier; having an additional on-heap tier did not seem to make a difference.

When this test runs, if I configure the 4 writer threads with a short burst of just a few seconds, everything works fine. When I configure a longer burst (4s does the trick for me) then the problem occurs consistently. At the end of the test, the log output shows the issue like this:

2023-12-05 18:21:32.217 [main] INFO  n.s.c.i.b.dao.AsyncDaoDatumCollector Put: 10305, store: 5675

The Put: 10305 is the number of entries added to the cache by all writer threads, and store: 5675 is the number of entries iterated & removed... these two numbers should be the same, but in this example 4630 entries are "missing". I would really appreciate any insight on this issue!

Sorry for the noise; I was able to simplify the test and have found the issue is not in Ehcache.