sunng87 / ring-jetty9-adapter

An enhanced version of jetty adapter for ring, with additional features like websockets, http/2 and http/3

c10k problem in jdk 21 with virtual threads.

jasonjckn opened this issue

I thought I'd post here because there was a previous thread here on Loom; feel free to close if off topic.
This is probably not an issue with ring-jetty9-adapter; the issue is likely further down the stack.

In a simple benchmark, my app can't handle 10k concurrent requests with virtual threads: I get very poor throughput and, most notably, socket errors. I've attached minimal code to reproduce it.

I'm using Jetty v11; I'd be very curious to see how Jetty v12 handles it.
I'm running JDK 21 (EA) and tried both the Zulu and Oracle builds on arm64 macOS. (I also tested on Linux, which shows poor throughput but no socket errors.)

        info.sunng/ring-jetty9-adapter {:mvn/version "0.22.1"}
        org.eclipse.jetty/jetty-server 11.0.15

Please see attachment for source code.

The tests were run using 'hey' and 'wrk'.

With 1k concurrent connections:

 ⚡ wrk --latency --timeout 1m -d 2s -c 1000 -t 1000 'http://localhost:3000/api/1.0/admin/uptest'
Running 2s test @ http://localhost:3000/api/1.0/admin/uptest
  1000 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.00s     3.29ms   1.02s    64.88%
    Req/Sec     0.04      0.20     1.00     95.70%
  Latency Distribution
     50%    1.00s
     75%    1.01s
     90%    1.01s
     99%    1.01s
  1999 requests in 2.10s, 322.10KB read
Requests/sec:    949.66
Transfer/sec:    153.02KB

With 10k concurrent connections:

 ⚡ wrk --latency --timeout 1m -d 2s -c 10000 -t 1000 'http://localhost:3000/api/1.0/admin/uptest'
Running 2s test @ http://localhost:3000/api/1.0/admin/uptest
  1000 threads and 10000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.41s     1.80ms   1.41s    65.54%
    Req/Sec     7.59      6.04    30.00     88.24%
  Latency Distribution
     50%    1.41s
     75%    1.41s
     90%    1.41s
     99%    1.41s
  148 requests in 2.11s, 23.85KB read
  Socket errors: connect 0, read 4982, write 0, timeout 0
Requests/sec:     70.00
Transfer/sec:     11.28KB

Comparing 1k vs 10k: the 10k run has socket errors (4982 read errors vs. none), far lower throughput (70.00 vs. 949.66 requests/sec), and fewer total requests answered (148 vs. 1999).
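
To put those numbers in perspective, here is a rough sanity check using Little's law (throughput is roughly concurrency divided by latency). It is only a sketch: it assumes every connection always has exactly one request in flight, the class and method names are made up for illustration, and a 2-second run is short enough that the wrk figures are noisy.

```java
// Rough sanity check using Little's law: throughput = concurrency / latency.
// Assumption: every connection always has exactly one request in flight.
final class LittlesLaw {
    /** Ideal requests/sec if `connections` requests are always in flight. */
    public static double expectedThroughput(int connections, double avgLatencySeconds) {
        return connections / avgLatencySeconds;
    }

    public static void main(String[] args) {
        // Connection counts and average latencies copied from the two wrk runs above.
        double ideal1k = expectedThroughput(1_000, 1.00);
        double ideal10k = expectedThroughput(10_000, 1.41);
        System.out.printf("1k:  ideal %.0f req/s, observed 949.66 req/s%n", ideal1k);
        System.out.printf("10k: ideal %.0f req/s, observed 70.00 req/s%n", ideal10k);
    }
}
```

Under that assumption the 1k run is close to its ideal of 1000 req/s, while the 10k run would ideally reach roughly 7000 req/s, about a hundred times what was observed. That gap suggests most of the 10k connections never completed a request at all, which fits the read errors.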

If I use 'hey', I get similar stats, and it prints a ton of:

   [1]   Get "http://localhost:3000/api/1.0/admin/uptest": read tcp [::1]:62673->[::1]:3000: read: connection reset by peer
  [1]   Get "http://localhost:3000/api/1.0/admin/uptest": read tcp [::1]:62675->[::1]:3000: read: connection reset by peer
  [1]   Get "http://localhost:3000/api/1.0/admin/uptest": read tcp [::1]:62678->[::1]:3000: read: connection reset by peer
  [1]   Get "http://localhost:3000/api/1.0/admin/uptest": read tcp [::1]:62680->[::1]:3000: read: connection reset by peer

It could relate to this. Would it be easy for you to start the JVM with -Djdk.tracePinnedThreads=full and look for the relevant console output? Moreover, could you try with regular OS threads and see whether things improve?

Hmm... it could also have to do with your OS - see this.

@jimpil Thanks for the quick reply,

If none of your ideas solve it, I'm thinking the number of selectors and acceptors might be the issue too, since they default to 1.

Hoping to get some more time this week to test these ideas out, thanks!

@jimpil An update on my experiments:

I tried -Djdk.tracePinnedThreads=full
Zero output from this... so I guess that's good.

I tried sudo sysctl -w net.inet.ip.portrange.first=32768
No difference

I tried various numbers of selector & acceptor threads
No difference

I also tried forcing C2 compilation for all bytecode (because I was seeing a lot of time spent compiling during profiling).
No difference

As for CLJ-2771: this is a possible culprit, since ring-jetty9-adapter calls enumeration-seq et al., which are synchronized. It wouldn't be my first guess, though.

The last thing I want to try is Jetty v12 and removing synchronized, but otherwise I'm a bit stumped.

If -Djdk.tracePinnedThreads=full didn't print out anything suspicious, then you can ignore CLJ-2771 (for your tests at least). I don't want to send you down the wrong path, but to me this sounds like an OS issue (hitting some sort of file descriptor or port limit).
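
One way to test the file descriptor hypothesis from inside the JVM is to ask the platform MXBean for the process's open and maximum descriptor counts. The following is a minimal sketch, not part of the attached repro: the class name FdLimits is hypothetical, and it relies on com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like JVMs (hence the fallback to -1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper: reports how many file descriptors this JVM has open
// and how many it is allowed to open. Each accepted socket uses one descriptor.
final class FdLimits {
    /** Returns {open, max} descriptor counts, or {-1, -1} when not on a Unix-like JVM. */
    public static long[] fileDescriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return new long[] {-1L, -1L};
    }

    public static void main(String[] args) {
        long[] counts = fileDescriptorCounts();
        System.out.println("open file descriptors: " + counts[0]);
        System.out.println("max file descriptors:  " + counts[1]);
    }
}
```

Running the same check inside the server process while the 10k benchmark is in flight would show whether the open count approaches the maximum. If the maximum is at or below roughly 10k, that would be consistent with the resets seen on macOS, though it would not prove it.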