reactor / reactor-core

Non-Blocking Reactive Foundation for the JVM

Home Page: http://projectreactor.io

TimedScheduler leaks cancelled pending task samples

nathankooij opened this issue

In #3642 an issue was reported and solved so that pending task samples are now stopped whenever the task's scheduling gets rejected. In addition to that, we noticed in one of our applications that there is another case where pending task samples can pile up: when the scheduled task gets cancelled before it runs. This is quite a common case for users of the timeout operator, which submits a delayed task representing the timeout. If the reactive chain completes before the timeout is hit, the delayed task is cancelled, but the pending sample is not stopped. Frequent use of the timeout operator on an instrumented scheduler can therefore easily accumulate pending samples.
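
The mechanism can be illustrated without Reactor or Micrometer. The following is a minimal, JDK-only sketch under stated assumptions: `LeakyPendingTracker` and its `pending` counter are hypothetical stand-ins for the instrumented scheduler and its pending-task sample count, not the actual TimedScheduler code. The counter goes up on submission and only goes down when the task starts running, so a cancelled task is never accounted for.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an instrumented scheduler; NOT Reactor's TimedScheduler.
final class LeakyPendingTracker {

	// Plays the role of the active count of the "pending tasks" timer.
	final AtomicInteger pending = new AtomicInteger();

	final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

	ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit) {
		pending.incrementAndGet(); // "sample" started at submission
		return executor.schedule(() -> {
			pending.decrementAndGet(); // "sample" stopped only when the task starts running
			task.run();
		}, delay, unit);
	}

	// Simulates `count` timeouts that are cancelled before they fire and
	// returns how many samples are still counted as pending afterwards.
	static int pendingAfterCancelling(int count) {
		LeakyPendingTracker tracker = new LeakyPendingTracker();
		try {
			for (int i = 0; i < count; i++) {
				ScheduledFuture<?> timeout = tracker.schedule(() -> {}, 1, TimeUnit.HOURS);
				timeout.cancel(false); // the chain completed first, so the timeout is cancelled
			}
			return tracker.pending.get();
		}
		finally {
			tracker.executor.shutdownNow();
		}
	}
}
```

In this sketch every cancelled timeout leaves one entry behind, so `LeakyPendingTracker.pendingAfterCancelling(n)` returns `n` rather than zero.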

Expected Behavior

Pending task samples for the TimedScheduler should get stopped when the task is cancelled/disposed.

Actual Behavior

The pending task sample is never stopped, and thus a memory leak is created.

Steps to Reproduce

	@Test
	void pendingTaskRemovedOnCancellation() {
		ExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
		Scheduler original = Schedulers.fromExecutorService(executorService);
		MeterRegistry registry = new SimpleMeterRegistry();
		TimedScheduler testScheduler = new TimedScheduler(original, registry, "test", Tags.empty());
		testScheduler.init();

		RequiredSearch requiredSearch = registry.get("test.scheduler.tasks.pending");
		LongTaskTimer longTaskTimer = requiredSearch.longTaskTimer();

		try {
			// Schedule a task far in the future.
			Disposable.Swap waitingTask = Disposables.swap();
			assertThatNoException().isThrownBy(() -> waitingTask.update(testScheduler.schedule(() -> {}, 10_000, TimeUnit.SECONDS)));

			// It's pending to be scheduled.
			assertThat(longTaskTimer.activeTasks()).as("active pending")
					.isOne();

			// E.g. a `Mono#timeout` was never hit, so the task gets disposed.
			waitingTask.dispose();

			// The task should no longer be considered pending, as it was disposed.
			assertThat(longTaskTimer.activeTasks()).as("active pending")
					.isZero();
		} finally {
			testScheduler.disposeGracefully().block(Duration.ofSeconds(1));
		}
	}

The final assertion fails with the current implementation.

Possible Solution

Wrapping the Disposable returned from the scheduling and having that also stop the sample could work, something like (kept short here, should be applied to all code paths):

diff --git a/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java b/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java
index 83b261ef4..f588b7e23 100644
--- a/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java
+++ b/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java
@@ -105,7 +105,7 @@ final class TimedScheduler implements Scheduler {
 		TimedRunnable timedTask = wrap(task);
 
 		try {
-			return delegate.schedule(timedTask, delay, unit);
+			return new TimedDisposable(delegate.schedule(timedTask, delay, unit), timedTask.pendingSample::stop);
 		}
 		catch (RejectedExecutionException exception) {
 			timedTask.pendingSample.stop();
@@ -199,6 +199,22 @@ final class TimedScheduler implements Scheduler {
 		}
 	}
 
+	static final class TimedDisposable implements Disposable {
+		final Disposable disposable;
+		final Runnable stopSample;
+
+		public TimedDisposable(Disposable disposable, Runnable stopSample) {
+			this.disposable = disposable;
+			this.stopSample = stopSample;
+		}
+
+		@Override
+		public void dispose() {
+			disposable.dispose();
+			stopSample.run();
+		}
+	}
+
 	static final class TimedRunnable implements Runnable {
 
 		final MeterRegistry registry;

Happy to file a PR if we agree on this solution.

Your Environment

  • Reactor version(s) used: 3.6.2
  • JVM version (java -version): OpenJDK 64-Bit Server VM GraalVM CE 22.3.0 (build 17.0.5+8-jvmci-22.3-b08, mixed mode, sharing)
  • OS and version (eg uname -a): Darwin MacBook-Pro-9.local 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul 5 22:21:56 PDT 2023; root:xnu-8796.141.3~6/RELEASE_X86_64 x86_64

Hi, thank you for the report. I see the shortcoming of the current design. I suppose there is not much of a way to avoid creating a new wrapper like you suggest. It would be great to avoid an object instantiation, but I can't see a different way to achieve this that wouldn't involve changing the Scheduler implementations. Please consider a PR and add appropriate tests, including ones that validate what happens if an already executed task is disposed, or if multiple dispose calls are issued.

Thanks!
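
One way to make the two cases mentioned above (dispose after execution, repeated dispose) harmless is to guard the stop action so it takes effect at most once. This is only a sketch under assumptions: `StopOnce` is a hypothetical helper, and the `AtomicInteger` stands in for the pending sample count. It does not describe how Micrometer's sample behaves when stopped twice.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: runs the wrapped action at most once, however many
// times and from whichever path (task start or dispose) it is invoked.
final class StopOnce implements Runnable {

	private final AtomicBoolean stopped = new AtomicBoolean();
	private final Runnable stopAction;

	StopOnce(Runnable stopAction) {
		this.stopAction = stopAction;
	}

	@Override
	public void run() {
		// compareAndSet succeeds for exactly one caller, even across threads
		if (stopped.compareAndSet(false, true)) {
			stopAction.run();
		}
	}

	// Returns the pending count after the task "ran" and was then disposed twice.
	static int pendingAfterRunThenDoubleDispose() {
		AtomicInteger pending = new AtomicInteger(1); // one sample in flight
		StopOnce stop = new StopOnce(pending::decrementAndGet);
		stop.run(); // task starts executing: sample stopped
		stop.run(); // dispose() after execution: no effect
		stop.run(); // second dispose(): no effect
		return pending.get();
	}
}
```

With such a guard, the count ends at zero instead of going negative when the same sample is stopped from several paths.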

Thanks for getting back to me. I hope to find some time this week to open a PR with the proposed approach. 👍

@nathankooij are you still interested in contributing?

@chemicL it might be possible to create slightly fewer objects (yes, certain lambda invocations will most likely create objects, but not all).

diff --git a/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java b/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java
--- a/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java	(revision 0a05defa56e6a18bb43a896f6aba1f360dd7a2f4)
+++ b/reactor-core-micrometer/src/main/java/reactor/core/observability/micrometer/TimedScheduler.java	(date 1712161697491)
@@ -18,6 +18,7 @@
 
 import java.util.concurrent.RejectedExecutionException;
 import java.util.concurrent.TimeUnit;
+import java.util.function.Function;
 
 import io.micrometer.core.instrument.Counter;
 import io.micrometer.core.instrument.LongTaskTimer;
@@ -91,10 +92,10 @@
 		TimedRunnable timedTask = wrap(task);
 
 		try {
-			return delegate.schedule(timedTask);
+			return timedTask.schedule(delegate::schedule);
 		}
 		catch (RejectedExecutionException exception) {
-			timedTask.pendingSample.stop();
+			timedTask.dispose();
 			throw exception;
 		}
 	}
@@ -105,10 +106,10 @@
 		TimedRunnable timedTask = wrap(task);
 
 		try {
-			return delegate.schedule(timedTask, delay, unit);
+			return timedTask.schedule(r -> delegate.schedule(r, delay, unit));
 		}
 		catch (RejectedExecutionException exception) {
-			timedTask.pendingSample.stop();
+			timedTask.dispose();
 			throw exception;
 		}
 	}
@@ -116,7 +117,7 @@
 	@Override
 	public Disposable schedulePeriodically(Runnable task, long initialDelay, long period, TimeUnit unit) {
 		this.submittedPeriodicInitial.increment();
-		return delegate.schedulePeriodically(wrapPeriodic(task), initialDelay, period, unit);
+		return wrapPeriodic(task).schedule(r -> delegate.schedulePeriodically(r, initialDelay, period, unit));
 	}
 
 	@Override
@@ -170,10 +171,10 @@
 			TimedRunnable timedTask = parent.wrap(task);
 
 			try {
-				return delegate.schedule(timedTask);
+				return timedTask.schedule(delegate::schedule);
 			}
 			catch (RejectedExecutionException exception) {
-				timedTask.pendingSample.stop();
+				timedTask.dispose();
 				throw exception;
 			}
 		}
@@ -184,10 +185,10 @@
 			TimedRunnable timedTask = parent.wrap(task);
 
 			try {
-				return delegate.schedule(timedTask, delay, unit);
+				return timedTask.schedule(r -> delegate.schedule(r, delay, unit));
 			}
 			catch (RejectedExecutionException exception) {
-				timedTask.pendingSample.stop();
+				timedTask.dispose();
 				throw exception;
 			}
 		}
@@ -195,18 +196,22 @@
 		@Override
 		public Disposable schedulePeriodically(Runnable task, long initialDelay, long period, TimeUnit unit) {
 			parent.submittedPeriodicInitial.increment();
-			return delegate.schedulePeriodically(parent.wrapPeriodic(task), initialDelay, period, unit);
+			return parent.wrapPeriodic(task)
+					.schedule(r -> delegate.schedulePeriodically(r, initialDelay, period, unit));
 		}
 	}
 
-	static final class TimedRunnable implements Runnable {
+	static final class TimedRunnable implements Runnable, Disposable {
 
 		final MeterRegistry registry;
+
 		final TimedScheduler parent;
 		final Runnable task;
 
 		final LongTaskTimer.Sample pendingSample;
 
+		Disposable disposable;
+
 		boolean isRerun;
 
 		TimedRunnable(MeterRegistry registry, TimedScheduler parent, Runnable task) {
@@ -245,5 +250,22 @@
 			Runnable completionTrackingTask = parent.completedTasks.wrap(this.task);
 			this.parent.activeTasks.record(completionTrackingTask);
 		}
+
+		public Disposable schedule(Function<Runnable, Disposable> scheduler) {
+			this.disposable = scheduler.apply(this);
+
+			return this;
+		}
+
+		@Override
+		public void dispose() {
+			if (this.disposable != null) {
+				this.disposable.dispose();
+			}
+
+			if (this.pendingSample != null) {
+				this.pendingSample.stop();
+			}
+		}
 	}
 }

Albeit it becomes quite convoluted code-wise.
The difficulty is the lack of a common interface between Worker and Scheduler, plus the bi-directional reference: the task that needs to be submitted is also the one that needs to be disposed.

@chemicL moving away from the previously mentioned approach, I've created a PR to address the issue. It includes some modifications so that it doesn't introduce extra objects but instead delegates the scheduling to the task itself, thereby also moving the counter logic that was duplicated between the Worker and Scheduler implementations into the TimedRunnable.
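
The idea of a task that schedules itself and doubles as its own cancellation handle can be sketched with JDK types only. This is an illustration under assumptions, not the code from the PR: `SelfSchedulingTask`, its `cancel()` method and the `pending` counter are hypothetical, and a `Future` stands in for Reactor's Disposable.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical JDK-only analogue: the submitted Runnable is also the handle
// the caller uses to cancel, so no separate wrapper object is needed.
final class SelfSchedulingTask implements Runnable {

	private final Runnable task;
	private final AtomicInteger pending; // stand-in for the pending sample count
	private final AtomicBoolean sampleStopped = new AtomicBoolean();
	private volatile Future<?> future;

	SelfSchedulingTask(Runnable task, AtomicInteger pending) {
		this.task = task;
		this.pending = pending;
		pending.incrementAndGet(); // "sample" started on creation
	}

	// The caller passes in how to submit; the task keeps the resulting handle.
	SelfSchedulingTask schedule(Function<Runnable, Future<?>> scheduler) {
		this.future = scheduler.apply(this);
		return this;
	}

	@Override
	public void run() {
		stopSample();
		task.run();
	}

	void cancel() {
		Future<?> f = this.future;
		if (f != null) {
			f.cancel(false);
		}
		stopSample();
	}

	private void stopSample() {
		// Guarded so that run() and repeated cancel() calls decrement only once.
		if (sampleStopped.compareAndSet(false, true)) {
			pending.decrementAndGet();
		}
	}

	// Schedules a far-future task, cancels it twice, returns the pending count.
	static int pendingAfterCancel() {
		ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
		AtomicInteger pending = new AtomicInteger();
		try {
			SelfSchedulingTask handle = new SelfSchedulingTask(() -> {}, pending)
					.schedule(r -> executor.schedule(r, 1, TimeUnit.HOURS));
			handle.cancel();
			handle.cancel();
			return pending.get();
		}
		finally {
			executor.shutdownNow();
		}
	}
}
```

Passing the submission step as a function lets the same task type serve several scheduling paths (immediate, delayed, periodic), which is the property the comment above relies on to share logic between Worker and Scheduler.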

Thanks, indeed the solution looks neat. Let's get this merged 🎉