spring-projects / spring-data-mongodb

Provides support to increase developer productivity in Java when using MongoDB. Uses familiar Spring concepts such as a template classes for core API usage and lightweight repository style data access.

Home Page:https://spring.io/projects/spring-data-mongodb/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Misleading Change Streams documentation

michaldo opened this issue · comments

Change Streams can be consumed with both, the imperative and the reactive MongoDB Java driver. It is highly recommended to use the reactive variant, as it is less resource-intensive.

src/main/antora/modules/ROOT/pages/mongodb/change-streams.adoc

I do not understand why reactive variant is highly recommended. I follow example give in documentation part interactive, and I see one thread is started and listen changes on Mongo socket.
Is one extra thread resource intensive?
What about virtual threads?

If I use standard Mongo client in application and I want use Change Stream, should I
a) rewrite application
b) use 2 clients in single application
c) choose not recommended variant
?

IMHO documentation preface presents false statement or rough opinion without justification/explanation. The preface is misleading and may cause wrong developers decision.

I propose remove the sentence at all or clarify resource utilization subject

Reactive Change Streams start their activity from a push-based I/O mechanism while imperative Change Streams (also, tailable cursors) require constant polling. Polling is a repetitive activity that consumes computation time while there might be no data at all. Additionally, each imperative change stream requires one thread resulting in a linear relationship between the number of change streams being consumed vs. CPU and memory cost.

Reactive Change streams are constrained in the number of their I/O threads to roughly the number of CPU cores regardless the number of concurrent change streams being consumed.

We introduced change stream and tailable cursor support in 2018. I think in late 2018, the JDK team started considering lightweight threads if I'm not mistaken.
I think it is worth exploring how Virtual Threads behave if you enable these on your Executor, let us know how this goes.

Thanks Mark for answer. I was focused on thread issue and I miss poll issue.

My findings:

  1. Both standard and reactive solution capture changes by calling one per second query: DEBUG o.m.driver.protocol.command - Command "getMore" started...
    At this point my reactor skill ends.
    For standard client, when I used platform thread, the thread dump is:
"my-platform-cdc" #58 [34188] prio=5 os_prio=0 cpu=0.00ms elapsed=4.93s tid=0x000001e6153831b0 nid=34188 runnable  
[0x0000003f225fe000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.Net.poll(java.base@21.0.3/Native Method)
	at sun.nio.ch.NioSocketImpl.park(java.base@21.0.3/NioSocketImpl.java:191)
	at sun.nio.ch.NioSocketImpl.park(java.base@21.0.3/NioSocketImpl.java:201)
	at sun.nio.ch.NioSocketImpl.implRead(java.base@21.0.3/NioSocketImpl.java:309)
	at sun.nio.ch.NioSocketImpl.read(java.base@21.0.3/NioSocketImpl.java:346)
	at sun.nio.ch.NioSocketImpl$1.read(java.base@21.0.3/NioSocketImpl.java:796)
	at java.net.Socket$SocketInputStream.read(java.base@21.0.3/Socket.java:1109)
	at com.mongodb.internal.connection.SocketStream.read(SocketStream.java:176)

When I use virtual thread, thred dump is:

#56 "my-virt-cdc" virtual
      java.base/java.lang.VirtualThread.park(VirtualThread.java:582)
      java.base/java.lang.System$2.parkVirtualThread(System.java:2643)
      java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
      java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
      java.base/sun.nio.ch.Poller.pollIndirect(Poller.java:139)
      java.base/sun.nio.ch.Poller.poll(Poller.java:102)
      java.base/sun.nio.ch.Poller.poll(Poller.java:87)
      java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:175)
      java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:201)
      java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
      java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
      java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
      java.base/java.net.Socket$SocketInputStream.read(Socket.java:1109)
      com.mongodb.internal.connection.SocketStream.read(SocketStream.java:176)

My conclusion is:

  1. Both standard and reactive are effectively poll solution, because Command "getMore" is executed one per second.
  2. I guess it is not good that platform thread is RUNNABLE - otherwise you will not prefer reactive in the documentation preface.
  3. I guess it is good that virtual thread is parked (java.lang.VirtualThread.park(VirtualThread.java:582))

As for now, the issue isn't actionable and therefore I'm closing it.