Memory Leak in Solandra/Solr
pcoleman opened this issue
When handling a large number of Solr queries, we noticed a memory leak. Each query seems to add to the memory used, and eventually Solr is forced to try to free up memory, which fails, leaving it stuck in an infinite loop.
I did some profiling on Solandra, and it looks like most of the memory is held by document fields and the actual field objects (in our case, strings). That suggests a runaway cache, but, as in the default settings, I have all the caches disabled.
I'm including the Solr and Solandra settings below:
Solandra:
solandra.compression = true
solandra.consistency = QUORUM
solandra.cache.invalidation.check.interval = 1000
solandra.maximum.docs.per.shard = 1048576
solandra.index.id.reserve.size = 65536
solandra.shards.at.once = 4
solandra.write.buffer.queue.size = 200
solandra.keyspace = L
cassandra.retries = 1024
cassandra.retries.sleep = 100
Solr:
<config>
<abortOnConfigurationError>true</abortOnConfigurationError>
<lib dir="../../" />
<lib dir="../../lib" />
<lib dir="../../config" />
<updateHandler class="solandra.SolandraIndexWriter"/>
<indexReaderFactory name="IndexReaderFactory" class="solandra.SolandraIndexReaderFactory"/>
<query>
<maxBooleanClauses>1024</maxBooleanClauses>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<fieldValueCache
class="solr.LRUCache"
size="0"
initialSize="0"
autowarmCount="0"/>
<queryResultWindowSize>20</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<listener event="newSearcher" class="solr.QuerySenderListener">
<arr name="queries">
</arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
<arr name="queries">
</arr>
</listener>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>4</maxWarmingSearchers>
</query>
<requestDispatcher handleSelect="true" >
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" />
<httpCaching never304="true">
</httpCaching>
</requestDispatcher>
The rest is the same as the defaults.
This happens when you don't do any writes: the caches keep growing to accommodate more terms. The workaround is to send some writes or updates to the index, which will flush the caches. Longer term, the caches should be changed to use an LRU eviction policy.
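For illustration, here is a minimal sketch of the bounded, LRU-evicting behavior being suggested for those internal term/field caches. This is in Python rather than Solandra's Java, and the `LRUCache` class and `"term:*"` keys are hypothetical, not part of Solandra's or Solr's API; the point is only that a capped cache evicts the least recently used entry instead of growing without bound between writes.

```python
from collections import OrderedDict

class LRUCache:
    """Size-bounded cache that evicts the least recently used entry."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        return default

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(max_size=2)
cache.put("term:a", 1)
cache.put("term:b", 2)
cache.get("term:a")          # touch "a" so "b" becomes the eviction candidate
cache.put("term:c", 3)       # capacity exceeded: "term:b" is evicted
print(cache.get("term:b"))   # → None
print(cache.get("term:a"))   # → 1
```

With a cap like this, memory use stays bounded regardless of how many distinct terms the query stream touches, so the workaround of forcing writes just to flush the caches would no longer be needed.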