Channel cache leak when no answers from broker for pending confirms
artembilan opened this issue
Discussed in #2637
Originally posted by arn-rio February 26, 2024
Hello,
We experienced issues using correlated publisher confirms with RabbitMQ.
The Rabbit server was unstable for a while. Once it was restored, we were unable to publish new confirmed messages to it (the maximum number of channels on the connection was reached and the existing channels were ignored).
The only solution was to restart the service (we could force-close the connection, but that only worked when no channel cache size was set on the CachingConnectionFactory).
We investigated and found that:
- When a message is published but not confirmed, its channel is not reused for another message (see the Spring AMQP documentation).
- So CachingConnectionFactory creates new channels, and the number of channels on the connection keeps growing.
- Eventually there is no channel available on the connection: the maximum number of channels is reached. When channelCacheSize and channelCheckoutTimeout are set on the CachingConnectionFactory, we get a 'No available channels' error; otherwise, 'The channelMax limit is reached'.
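For context, here is roughly how we configure the factory. This is a hypothetical sketch, not our exact code; the host and the specific cache-size and timeout values are placeholders:

```java
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;

// Sketch of the factory setup (host and values are placeholders).
CachingConnectionFactory cf = new CachingConnectionFactory("localhost");
cf.setPublisherConfirmType(CachingConnectionFactory.ConfirmType.CORRELATED);
cf.setChannelCacheSize(10);          // upper bound on cached channels
cf.setChannelCheckoutTimeout(5000);  // ms; turns the cache size into a hard
                                     // limit, so we fail fast with
                                     // "No available channels" instead of
                                     // opening ever more channels
```

With channelCheckoutTimeout unset (0), the cache size is only a soft limit and the factory keeps opening channels until the broker's channelMax is hit, which matches the two different errors we observed.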
We call RabbitTemplate.getUnconfirmed() in a periodic task: the correlation data are removed, but unfortunately the related channel is not freed and is not available for new publishes.
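The cleanup task looks roughly like the sketch below. The class name, schedule, and age threshold are hypothetical, and it assumes a RabbitTemplate injected from a Spring context with a running broker:

```java
import java.util.Collection;

import org.springframework.amqp.rabbit.connection.CorrelationData;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.scheduling.annotation.Scheduled;

// Hypothetical periodic cleanup of pending confirms.
public class UnconfirmedSweeper {

    private final RabbitTemplate rabbitTemplate;

    public UnconfirmedSweeper(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @Scheduled(fixedDelay = 60_000)
    public void sweep() {
        // Returns correlation data for messages unconfirmed after 'age' ms
        // (may be null) and removes them from the pending map ...
        Collection<CorrelationData> unconfirmed = rabbitTemplate.getUnconfirmed(30_000);
        if (unconfirmed != null) {
            unconfirmed.forEach(cd -> { /* re-publish or log */ });
        }
        // ... but the channel that published them still has pending confirms
        // recorded against it, so it is never returned to the cache.
    }
}
```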
You can observe and reproduce the issue as follows:
- a periodic task that publishes a confirmed message to Rabbit every 20 ms
- a cron job that calls RabbitTemplate.getUnconfirmed()
- on a local Rabbit server, block then unblock connections by updating the memory watermark:
```shell
# block connections
rabbitmqctl set_vm_memory_high_watermark <low threshold>
# unblock
rabbitmqctl set_vm_memory_high_watermark <high threshold>
```
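The publishing side of the repro can be sketched as below. The exchange and routing key names are placeholders, and the class again assumes a Spring context with scheduling enabled and an injected RabbitTemplate:

```java
import java.util.UUID;

import org.springframework.amqp.rabbit.connection.CorrelationData;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.scheduling.annotation.Scheduled;

// Hypothetical publisher matching the repro steps above.
public class ConfirmedPublisher {

    private final RabbitTemplate rabbitTemplate;

    public ConfirmedPublisher(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @Scheduled(fixedRate = 20)
    public void publish() {
        // Each publish carries correlation data so the confirm can be matched.
        CorrelationData cd = new CorrelationData(UUID.randomUUID().toString());
        rabbitTemplate.convertAndSend("test.exchange", "test.rk", "payload", cd);
    }
}
```

While connections are blocked by the memory alarm, confirms never arrive, so each publish checks out a fresh channel and the leak becomes visible quickly at this rate.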
Is there a better way to manage unconfirmed messages? How can we free the channels when a confirmation is not received before a timeout?
Thanks