org.apache.http.impl.nio.conn.CPool leak after upgrading 5.4.1 -> 6.5.6

Question

siddheshlatkar opened this issue 8 months ago

We have been using version 5.4.1 for over a year without any problems. However, after updating to version 6.5.6, we have noticed a leak.

One instance of "org.apache.http.impl.nio.conn.CPool" loaded by "org.springframework.boot.loader.LaunchedURLClassLoader @ 0x733200000" occupies 352,023,744 (56.88%) bytes. The memory is accumulated in one instance of "java.util.LinkedList" loaded by "<system class loader>".

Our application calls Elasticsearch (ES) for every GraphQL request and returns the data. After the upgrade, the app works well for about five minutes but then fails to make calls to ES, possibly due to the leak described above. We have determined that the issue appears after upgrading the Expedia graphql-kotlin library version.

Dariusz Kuc · Answer 1 · Fri Nov 10 2023 12:11:33 GMT+0800 (China Standard Time)

Hello 👋
Based on my brief search, it looks like org.apache.http.impl.nio.conn.CPool is part of the org.apache.httpcomponents:httpasyncclient library. I don't see this as a dependency of graphql-kotlin-spring-server v6.5.6 (which uses SpringBoot v2.7.2). Double-check your project dependencies to see where that dependency is coming from and upgrade it accordingly.

If there is an issue with graphql-kotlin, please provide a link to a GitHub repo that reproduces the issue.

siddheshlatkar · Answer 2 · Sat Nov 18 2023 04:55:26 GMT+0800 (China Standard Time)

For anyone who stumbles upon this:

The app was running into this issue because FunctionDataFetcher.runSuspendingFunction was changed to run suspending functions in a CoroutineScope. In the previous version, it ran them in GlobalScope.
Our app calls the async Elasticsearch APIs and uses listeners. In the CoroutineScope, the coroutines were exiting before the ES connection was closed.
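One way a listener-based call can be kept inside the coroutine's lifetime is to wrap it in a suspending function, so the coroutine only finishes once the listener has fired. The sketch below is illustrative only: `Listener` and `searchAsync` are hypothetical stand-ins (not the real Elasticsearch client API), and it assumes kotlinx.coroutines is on the classpath.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlin.concurrent.thread
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical stand-in for a listener-based client API (not the real ES client).
interface Listener<T> {
    fun onResponse(response: T)
    fun onFailure(e: Exception)
}

// Hypothetical async call: finishes on another thread and notifies the listener.
fun searchAsync(query: String, listener: Listener<String>) {
    thread {
        try {
            Thread.sleep(50) // simulate network latency
            listener.onResponse("result for $query")
        } catch (e: Exception) {
            listener.onFailure(e)
        }
    }
}

// Suspends the caller until the listener fires, so a request-scoped
// coroutine cannot complete before the async call has completed.
suspend fun search(query: String): String =
    suspendCancellableCoroutine { cont ->
        searchAsync(query, object : Listener<String> {
            override fun onResponse(response: String) = cont.resume(response)
            override fun onFailure(e: Exception) = cont.resumeWithException(e)
        })
    }

fun main() = runBlocking {
    println(search("books"))
}
```

With the real client, the same shape would apply to whatever listener type and cancellation handle it exposes; those details are not shown here.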

Dariusz Kuc · Answer 3 · Sat Nov 18 2023 05:12:41 GMT+0800 (China Standard Time)

Hello 👋
A CoroutineScope is created per GraphQL request*, so all the coroutines are bound to the lifecycle of the request. GlobalScope is bound only to the app lifecycle, so it was possible to have runaway coroutines that were still executing even though the request had already finished.

If your coroutines are exiting before closing the ES connection, it means that you are not managing their scopes correctly, and you should probably use a custom scope to launch your long-lived ES workers.
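A minimal sketch of what such a custom scope might look like, assuming kotlinx.coroutines is available. `EsWorkerScope` and `submit` are hypothetical names for illustration, not part of graphql-kotlin or the ES client. The worker scope is owned by the application, so cancelling a request's scope does not cancel work submitted to it.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.cancel
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical application-level scope for long-lived ES work.
// SupervisorJob keeps one failed worker from cancelling the others.
class EsWorkerScope : AutoCloseable {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.IO)

    // Launches work that is independent of any request scope.
    fun submit(work: suspend () -> Unit): Job = scope.launch { work() }

    // Call on application shutdown to cancel outstanding workers.
    override fun close() = scope.cancel()
}

fun main() = runBlocking {
    val workers = EsWorkerScope()
    val cleanedUp = CompletableDeferred<Boolean>()

    // Simulated request scope: it hands work to the worker scope and ends.
    val requestScope = CoroutineScope(Job())
    requestScope.launch {
        workers.submit {
            delay(100)               // simulate waiting for the ES listener
            cleanedUp.complete(true) // simulate releasing the connection
        }
    }.join()
    requestScope.cancel() // the request is over; the worker keeps running

    println("cleanup finished: ${cleanedUp.await()}")
    workers.close()
}
```

The trade-off is that work in this scope is no longer cancelled with the request, so the application has to close the scope itself (for example on shutdown) to avoid the runaway coroutines that GlobalScope allowed.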

*you can customize the context through the corresponding factory