spring-projects / spring-amqp

Spring AMQP - support for Spring programming model with AMQP, especially but not limited to RabbitMQ

Home Page:https://spring.io/projects/spring-amqp

Channel cache leak when no answers from broker for pending confirms

artembilan opened this issue

Discussed in #2637

Originally posted by arn-rio February 26, 2024
Hello,

We ran into a problem using Rabbit publishers with correlated confirms.
The Rabbit server was unstable for a while. Once it was restored, we could no longer publish confirmed messages to it: the maximum number of channels on the connection had been reached, and the existing channels were ignored.
The solution was to restart the service. (We could also force-close the connection, but that only worked when the channel cache size was not set on CachingConnectionFactory.)

We investigated and found that:

  • when a message has been published but not yet confirmed, its channel is not used for another message (see the Spring AMQP documentation);
  • so CachingConnectionFactory creates new channels, and the number of channels on the connection increases;
  • eventually no channel is available on the connection, because the maximum number of channels has been reached.
    When channelCacheSize and channelCheckoutTimeout are set on CachingConnectionFactory, we get a 'No available channels' error; otherwise we get 'The channelMax limit is reached'.
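
The chain of events above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it uses nothing from Spring AMQP, and `ToyChannelPool`, its `channelMax` argument and its methods are hypothetical names invented for the illustration. It shows only the reported pattern, in which a channel with an outstanding confirm is never handed out again, so a broker that stops confirming eventually exhausts the limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: not Spring AMQP code. It mimics the reported behaviour that a
// channel with an outstanding publisher confirm is not handed out again.
class ToyChannelPool {
    private final int channelMax;              // hypothetical per-connection limit
    private final Deque<Integer> idle = new ArrayDeque<>();
    private int opened = 0;

    ToyChannelPool(int channelMax) {
        this.channelMax = channelMax;
    }

    // Publish on an idle channel if one exists, otherwise open a new one.
    // The channel stays checked out until confirm(...) is called for it.
    int publish() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (opened >= channelMax) {
            throw new IllegalStateException("channel limit reached: " + channelMax);
        }
        return ++opened;
    }

    // Broker confirm arrived: the channel becomes reusable.
    void confirm(int channel) {
        idle.push(channel);
    }

    int openedChannels() {
        return opened;
    }

    public static void main(String[] args) {
        ToyChannelPool pool = new ToyChannelPool(3);
        // Healthy broker: every publish is confirmed, so one channel is enough.
        for (int i = 0; i < 10; i++) {
            pool.confirm(pool.publish());
        }
        System.out.println("opened with confirms: " + pool.openedChannels());
        // Blocked broker: no confirms arrive, so channels are never returned.
        pool.publish();
        pool.publish();
        pool.publish();
        try {
            pool.publish();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the pool never shrinks on its own: only a confirm returns a channel, which matches the situation described where channels stay unavailable even after the broker recovers.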

We call RabbitTemplate.getUnconfirmed() in a periodic task: the correlation data are removed, but unfortunately the related channel is not freed and is not available for new publishes.
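
To make that observation concrete, here is a second toy model, again a sketch that does not use or reproduce Spring AMQP internals. `ToyConfirmTracker` and all of its members are hypothetical. It keeps pending correlation ids separately from the set of channels held for outstanding confirms, and its cleanup method only touches the former, which is the behaviour we observed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: not Spring AMQP code. All names here are hypothetical.
class ToyConfirmTracker {
    // correlation id -> publish time in milliseconds
    private final Map<String, Long> pending = new LinkedHashMap<>();
    // channels currently held because a confirm is outstanding
    private final Set<Integer> busyChannels = new HashSet<>();

    void published(String correlationId, int channel, long nowMillis) {
        pending.put(correlationId, nowMillis);
        busyChannels.add(channel);
    }

    // Analogue of the periodic cleanup: drop and return correlation ids older
    // than ageMillis. busyChannels is deliberately left untouched, which is
    // the behaviour reported in this issue.
    List<String> expireOlderThan(long ageMillis, long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (nowMillis - e.getValue() >= ageMillis) {
                expired.add(e.getKey());
                it.remove();
            }
        }
        return expired;
    }

    int pendingCount() {
        return pending.size();
    }

    int busyChannelCount() {
        return busyChannels.size();
    }
}
```

After a cleanup pass in this model, `pendingCount()` drops to zero while `busyChannelCount()` stays the same, so the correlation data are gone but the channels remain unavailable.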

You can observe and reproduce the issue as follows:

  • a periodic task that publishes a confirmed message to Rabbit every 20 ms
  • a cron job that calls RabbitTemplate.getUnconfirmed()
  • on a local Rabbit server, block and then unblock connections by changing the memory high watermark
	// block connections
	rabbitmqctl set_vm_memory_high_watermark <low threshold>
	// unblock
	rabbitmqctl set_vm_memory_high_watermark <high threshold>

Is there a better way to manage unconfirmed messages? How can we free the channels when a confirmation is not received before a timeout?

Thanks