reactor / reactor-netty

TCP/HTTP/UDP/QUIC client/server with Reactor over Netty

Home Page: https://projectreactor.io

Connections warming up

Brieux1 opened this issue

Hello,
We are migrating an application to Spring WebFlux and have questions about warming up the Netty client.
Our application needs a lot of time/requests before the Netty client is ready.

Expected Behavior

After warming up (a few requests), the connection pool is ready to handle incoming requests.

Actual Behavior

We warm up the application by calling the web service manually several times (~10 times) and the responses are OK.

Then we start a load test run (~400 requests/sec for 1 minute).
The results are KO (see failed_run screenshot attached) after processing thousands of requests.
[failed_run screenshot]

Then we launch the same test after a few seconds and the results are OK (see successful_run screenshot attached).
[successful_run screenshot]

Context:

  • our application exposes one web service through a REST controller
  • the application then makes several calls to another application (simulated by a WireMock application in our testing environment)

Our WebClient configuration:
ConnectionProvider provider = ConnectionProvider.builder("custom-http")
        .pendingAcquireTimeout(Duration.ofSeconds(10))
        .maxConnections(200)
        .pendingAcquireMaxCount(10000)
        .maxIdleTime(Duration.ofSeconds(30))
        .maxLifeTime(Duration.ofSeconds(60))
        .evictInBackground(Duration.ofSeconds(120))
        .disposeInactivePoolsInBackground(Duration.ofSeconds(180), Duration.ofSeconds(175))
        .build();

HttpClient httpClient = HttpClient.create(provider)
        .metrics(true, s -> s)
        .resolver(DefaultAddressResolverGroup.INSTANCE)
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 2000)
        .responseTimeout(Duration.ofMillis(2000))
        .doOnConnected(conn -> {
            conn.addHandlerLast(new IdleStateHandler(3000, 3000, 3000, TimeUnit.MILLISECONDS));
            conn.addHandlerLast(new WriteTimeoutHandler(5000, TimeUnit.MILLISECONDS));
            conn.addHandlerLast(new ReadTimeoutHandler(5000, TimeUnit.MILLISECONDS));
        });
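For completeness, a sketch of how such an HttpClient is typically plugged into a Spring WebFlux WebClient via ReactorClientHttpConnector (the bean name and base URL below are illustrative, not taken from this report):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

@Configuration
public class PartnerWebClientConfig {

    @Bean
    public WebClient partnerWebClient(HttpClient httpClient) {
        // Wrap the reactor-netty HttpClient so WebClient requests go through the custom connection pool
        return WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .baseUrl("http://simulateurs-service") // illustrative base URL
                .build();
    }
}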

We can see that on the first run the Netty client uses nearly 200 active connections and stacks requests as pending acquisitions.
Whereas on the second run only 45 connections are needed to process all requests.
[connections screenshot]

Why does the client need so much time/so many requests to be ready?
We tested several configurations (various timeouts, pool sizes, calling the blocking warmup() method when initializing the client) without improvement.
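For reference, this is roughly how the blocking warmup() call is used at initialization; as far as we understand, it prepares the event loop groups and loads native libraries, but it does not pre-establish pooled connections to the remote service:

// Blocking warm-up at startup: initializes event loops and native libraries,
// but does not open connections to the remote host.
httpClient.warmup().block();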

Your Environment

Our testing environment:

  • kubernetes hosted on AWS
  • mocked partner (aka "simulateurs-service" in our context): 5 pods in the same namespace
  • Reactor version(s) used: 3.6.2
  • org.springframework.boot: 3.2.2
  • JVM version (java -version): openjdk 17.0.10

@Brieux1 You wrote that you did some warm-up (several requests), but is that enough to fill the connection pool so that it is ready to handle the load?
Is it possible that during the first run what you see is actually the connection pool being filled, i.e. connections being established to the remote system, while during the second run the pool already contains connections ready to handle the load?

Thank you for the response.
After some new tests, it appears that stopping the 1st run after 20 seconds is sufficient warm-up for the 2nd run to be handled correctly.
Also, if we extend the ramp-up duration to 40 seconds, the application is able to handle all requests of the 1st run properly.

My misunderstanding was about the connection pool filling that happens during the first seconds of the 1st run: over 3k requests are handled while the pool fills, yet the requests that follow in that same run still fail.
So at this point, the application can actually handle the load if we manage a clean ramp-up (i.e. filling the connection pool before routing the full request rate to the application), as sketched below.
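One way to approximate such a programmatic warm-up ourselves (a sketch only; reactor-netty does not expose this, and the endpoint and request count below are illustrative) would be to fire a burst of concurrent requests at the downstream service during startup, so that the pool is populated before real traffic arrives:

import java.time.Duration;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

// Fire 200 concurrent GETs (matching maxConnections) against an illustrative
// warm-up endpoint so the pool opens its connections before the load starts.
Flux.range(0, 200)
        .flatMap(i -> httpClient.get()
                .uri("http://simulateurs-service/health")
                .responseContent()   // consume the body so the connection returns to the pool
                .aggregate()
                .then()
                .onErrorResume(e -> Mono.empty()), 200)
        .blockLast(Duration.ofSeconds(30));

Note that with maxIdleTime set to 30 seconds (plus any load-balancer idle timeouts in between), such pre-established connections may be closed again before real traffic arrives.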

Can you confirm that there is no way to programmatically initialize that connection pool?

@Brieux1 Currently we do not expose configuration for warming up the connection pool. This is mostly because load balancers etc. have aggressive idle timeouts, so in a real production scenario, even if you warm up the connection pool, it is highly likely that the idle connections will be closed very quickly.

@Brieux1 I'm going to close this. For the time being there are no plans to expose warm-up functionality for the connection pool, for the reasons I shared above.