crate / crate-jdbc

A JDBC driver for CrateDB.

Home Page: https://crate.io/docs/jdbc/

When multiple crate nodes are listed in the connection string, only one is actually used

reversefold opened this issue

When checking the monitoring for our crate clusters (v0.49.4), I noticed that only one of the crate nodes listed in the jdbc connection string is actually being utilized. We have nodes which act only as clients (no master, no data); the heap usage for one of them shows heavy activity and GC, while the others show a nearly constant level of usage. I suspect that only the first client listed is used unless it is down, in which case the next one is used. I would expect the jdbc driver to round-robin across the listed hosts in order to spread the load across nodes.
[screenshot 2015-09-30 08:54: heap usage graphs of the client nodes]
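
For reference, here is roughly how we list the client nodes (a minimal sketch; the hostnames, the port, and the exact URL scheme are placeholders, not our actual configuration):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MultiNodeQuery {
    public static void main(String[] args) throws Exception {
        // Placeholder hostnames/port -- the point is that all four client-only
        // nodes are listed, and we expected queries to be spread across them.
        String url = "crate://client-1:4300,client-2:4300,client-3:4300,client-4:4300";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT name FROM sys.nodes")) {
            while (rs.next()) {
                System.out.println(rs.getString("name"));
            }
        }
    }
}
```

The expectation was that successive statements could land on different ones of the four listed clients.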

The jdbc driver actually works correctly here. It passes the list of hosts from the connection string, along with the queries, to the internal crate client, which spreads the load across nodes by randomly choosing a node from the cluster state. There might be another issue. Could you force garbage collection on the node that shows the high heap usage and report the outcome?
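
Conceptually the selection done by the internal client boils down to something like this (a simplified sketch with made-up names, not the actual crate client code):

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Simplified sketch of the idea described above: pick a node at random from
// the addresses known from the cluster state. Not the actual crate client code.
final class RandomNodeSelector {
    private final List<String> nodes;

    RandomNodeSelector(List<String> nodes) {
        this.nodes = nodes;
    }

    String pick() {
        return nodes.get(ThreadLocalRandom.current().nextInt(nodes.size()));
    }
}
```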

I doubt that forcing a GC would make any difference. The graph above shows the stereotypical sawtooth pattern of memory usage and large GCs, but only on one of the nodes. All of the others have nearly static memory usage. The former is the graph of a JVM doing work; the others are JVMs just idling.

I'm not sure from your description that the driver is doing the correct thing. It sounds like you're saying that the jdbc driver connects to a single client host and lets that client and the crate cluster farm out the query as needed. That is fine if you only have one client node, but we have four in this case. If the jdbc driver always sends its requests to one client, then the other clients never do anything, as they hold no data.

Another important point is that this is with crate 0.49.4.

Looking through the code myself, and assuming everything is working properly, TransportClientNodesService appears to loop through the nodes in its local list sequentially each time it executes a statement. However, all of the sampler and retry code below it is too complicated for me to verify at a glance that it does what I expect.
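
To illustrate what I mean, the per-statement selection appears to reduce to something like the following (a rough simplification with made-up names, not the actual Elasticsearch/crate code):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Rough sketch of the round-robin idea: a shared counter is incremented per
// request and used modulo the node list size, so consecutive statements hit
// consecutive nodes. A simplification, not a copy of TransportClientNodesService.
final class RoundRobinNodeSelector {
    private final AtomicInteger counter = new AtomicInteger();
    private final List<String> nodes;

    RoundRobinNodeSelector(List<String> nodes) {
        this.nodes = nodes;
    }

    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```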

@reversefold, the reason we're asking you to trigger a GC manually and see what happens is that the jdbc client might have a memory leak, and we're trying to look into that. The random selection of clients seems to work correctly. I think you can trigger the GC via the command line using JMX; I don't know the exact steps by heart, but I could look them up if that helps.
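
Something along these lines might work (an untested sketch; the hostname and JMX port are placeholders, and remote JMX has to be enabled on that node):

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceGc {
    public static void main(String[] args) throws Exception {
        // Placeholder host and JMX port; the node must expose remote JMX.
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://crate-client-1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Invoke the standard Memory MXBean's gc() operation.
            conn.invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
        }
    }
}
```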

So you're asking for a GC on the node that is using jdbc? The graph attached above is for the crate client nodes, not the nodes connecting via jdbc. It's not the memory usage of the jdbc driver itself I'm pointing out here; it's the heap usage of the crate client nodes.

All of the monitoring for the client nodes backs up this conclusion, by the way. CPU usage, thread count, and network usage all show the same pattern: one node is much higher than all the rest, while the others basically stay at a steady, idle level.

We were asking for a GC on the node that shows that behaviour. And, just out of curiosity: if you modify the connection string (e.g. remove that node from it), does another machine start showing the same behaviour?

Well, I feel sheepish. I looked again at our configuration and at which nodes were showing the increased heap usage. I'm not sure if we shared this explicitly before, but we separate our client nodes into three "pools": one for "real-time" read usage, one for writing, and one for scheduled reporting. According to one set of configuration (the one I was looking at when I opened this issue), the three high-usage nodes are spread across all three pools, hence my conclusion. Looking at a second set of configuration, though, which is the one our systems actually use when configuring the clients, the three high-usage nodes are the read nodes, which get the most queries and traffic.
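
For context, the pool split is essentially just a separate connection string per workload, along these lines (hostnames, port, and node counts are placeholders, not our actual setup):

```java
// Placeholder hostnames/port; each workload gets its own connection string
// listing only the client nodes assigned to that pool.
public final class ClientPools {
    static final String READ_URL   = "crate://read-1:4300,read-2:4300,read-3:4300";
    static final String WRITE_URL  = "crate://write-1:4300,write-2:4300,write-3:4300";
    static final String REPORT_URL = "crate://report-1:4300,report-2:4300,report-3:4300";
}
```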

In other words, I was wrong and your conclusions are validated. Thanks much for looking through this with me.

phew :) good news. thanks