7mind / izumi

Productivity-oriented collection of lightweight fancy stuff for Scala toolchain

Home Page: https://izumi.7mind.io

The only RoleTask does not exit after completion

amricko0b opened this issue

Greetings!

I've got a cats-effect IO based application with only a single RoleTask bootstrapped. Distage version: 1.0.8.
My RoleTask contains a long, blocking piece of code, but it doesn't block forever.

I thought that after the role's start() method completes, the application would exit, but it doesn't.
Am I wrong to assume this behaviour?

My role looks something like this:

class MyTaskRole(
    sender: MySender[IO],
    @Id("myTask-role-logger") logger: Logger[IO]
) extends RoleTask[IO] {

  override def start(
      roleParameters: RawEntrypointParams,
      freeArgs: Vector[String]
  ): IO[Unit] =
    for {
      _ <- logger.info("Running sending task")
      _ <- sender.sendBatchesUntilEmpty
      _ <- logger.info("Sending task complete!")
    } yield ()
}

object MyTaskRole extends RoleDescriptor {
  override def id: String = "myTaskRole"

  object Plugin extends PluginDef {
    include(
      new RoleModuleDef {
        makeRole[MyTaskRole]
      }
    )
  }
}
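
MySender isn't shown above; as a rough, illustrative sketch (the names and body below are invented for this issue, not my real implementation), it is a loop that keeps sending batches until nothing is left:

import cats.effect.IO

// Illustrative sketch only: pull a batch via a blocking call and send it,
// terminating once the source is drained.
trait MySender[F[_]] {
  def sendBatchesUntilEmpty: F[Unit]
}

final class MySenderImpl(fetchBatch: () => List[String], sendBatch: List[String] => Unit)
    extends MySender[IO] {
  override def sendBatchesUntilEmpty: IO[Unit] =
    IO.delay(fetchBatch()).flatMap {
      case Nil   => IO.unit                                                        // drained, stop
      case batch => IO.delay(sendBatch(batch)).flatMap(_ => sendBatchesUntilEmpty) // send, repeat
    }
}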

sender.sendBatchesUntilEmpty is a blocking call that performs a long-running task.
After it completes, I see the "Sending task complete!" message in the logs.
I also observe this line:

I 2021-08-04T17:48:26.407 (RoleAppEntrypoint.scala:82) …AppEntrypoint.Impl.runRoles [31:ioapp-compute-0] phase=late No services to run, exiting...

But, as I have already mentioned, the application does not exit.

Here is my launcher, in case it helps somehow.
I haven't overridden anything here.

object Launcher extends LauncherCats[IO] {

  implicit val ec: ExecutionContextExecutor =
    ExecutionContext.fromExecutor(
      Executors.newFixedThreadPool(10)
    )

  implicit val contextShiftIO: ContextShift[IO] = IO.contextShift(ec)

  implicit val concurrentEffectIO: ConcurrentEffect[IO] = IO.ioConcurrentEffect

  implicit val timerIO: Timer[IO] = IO.timer(ec)

  override protected def pluginConfig: PluginConfig =
    PluginConfig.const(
      Seq(
        MyTaskRole.Plugin,
        DI
      )
    )

  object DI extends PluginDef {
    // a bunch of include(...) directives
  }
}

Sounds like a bug. I'll look at it. Thanks.

I've tried to reproduce this in 634aea5, but the demo app works exactly as expected.

Could you check whether your app works on the latest snapshot? If it doesn't, could you modify my repro so that it actually reproduces the issue, please?

Many thanks for your attention, @pshirshov!

I have checked your reproduction; it works as expected for me as well.
Switching my app to 1.0.9-SNAPSHOT did not help either, but it led me to some sort of explanation.

You see, in your reproduction the task has no dependencies.
The actual task in my app, however, does (that's my fault, I hadn't mentioned it, sorry): it depends on MySender, which in turn depends on a KafkaProducer (from fs2-kafka) that is used for the actual sending. And this producer seems to be to blame.

So the complete example task looks like this. Here I use KafkaProducer directly for clarity:

class MyTaskRole(
    producer: KafkaProducer[IO, String, String],
    @Id("myTask-role-logger") logger: Logger[IO]
) extends RoleTask[IO] {

  override def start(
      roleParameters: RawEntrypointParams,
      freeArgs: Vector[String]
  ): IO[Unit] =
    for {
      _ <- logger.info("Running sending task")
      // _ <- producer.produce(...) we can skip this call entirely; it does not change the behaviour
      _ <- logger.info("Sending task complete!")
    } yield ()
}

The producer is declared as follows:

object DI extends PluginDef {
  // ...

  include(
    new ModuleDef {
      make[KafkaProducer[IO, String, String]].fromResource {
        KafkaProducer[IO].resource(
          ProducerSettings(
            keySerializer = Serializer.string[IO],
            valueSerializer = Serializer.string[IO]
          ).withBootstrapServers("localhost:9092")
            .withCloseTimeout(2.seconds)
        )
      }
    }
  )

  // ...
}

If I exclude the Kafka producer from the task's dependencies (by simply removing the field), everything works as expected: the role exits after completion.

It seems the problem is hidden either in KafkaProducer's resource (distage's .fromResource directive works fine for other long-lived resources) or in my misunderstanding of how to use it.
So it could be a bug, but it doesn't seem related to distage (probably to fs2-kafka, though I'm not sure; I'll continue my investigation).
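
For instance, a standalone check like the following (a hypothetical sketch that reuses the settings from above, not code from my app) should show whether the hang reproduces with fs2-kafka alone, without distage in the picture:

import cats.effect.{ExitCode, IO, IOApp}
import fs2.kafka.{KafkaProducer, ProducerSettings, Serializer}

// Acquire and release the producer resource directly; if the JVM still hangs
// after "done" is printed, distage is not involved in the problem.
object ProducerCheck extends IOApp {
  def run(args: List[String]): IO[ExitCode] =
    KafkaProducer[IO]
      .resource(
        ProducerSettings(
          keySerializer = Serializer.string[IO],
          valueSerializer = Serializer.string[IO]
        ).withBootstrapServers("localhost:9092")
      )
      .use(_ => IO(println("done")))
      .map(_ => ExitCode.Success)
}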

Hence, I think this issue can be closed for now.
Thanks for your help, and sorry again for the incomplete details.

Could you create a separate demo project with a repro, please? In any case this shouldn't happen, and we would be happy to investigate. If it turns out to be a bug in fs2 or cats, we'll fix it anyway :)

Sorry for the delay, I had to dedicate some time to work :(

I made a little demo project with a reproduction.
It can be found here.

I should also note that the problem disappears when I switch from the custom ExecutionContext (based on a FixedThreadPool) to the global one.
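
For illustration, the switch only changes the executor behind the implicits in the Launcher above (the global pool's threads are daemonic):

implicit val ec: ExecutionContextExecutor = ExecutionContext.global
implicit val contextShiftIO: ContextShift[IO] = IO.contextShift(ec)
implicit val timerIO: Timer[IO] = IO.timer(ec)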

@amricko0b
Thanks for the reproduction!
The problem in this demo is caused by creating a global execution context that is never closed and whose threads are non-daemon, which prevents the JVM from exiting.

Solution: https://github.com/amricko0b/distage-1557/pull/1
In detail, this causes the JVM to hang on shutdown. It can generally be prevented either by making the thread pool's threads daemonic using a custom ThreadFactory:

import java.util.concurrent.{Executors, ThreadFactory}
import scala.util.chaining.scalaUtilChainingOps

// daemon threads do not keep the JVM alive after the main thread finishes
Executors.newFixedThreadPool(16, new ThreadFactory {
  def newThread(r: Runnable): Thread = new Thread(r).tap(_.setDaemon(true))
})

Or by shutting the underlying ExecutorService down properly via its .shutdown() method when the application terminates. Distage provides a Lifecycle.fromExecutorService helper for that:

make[ExecutionContext].fromResource {
  // the pool is shut down when the Lifecycle is released, i.e. on application exit
  Lifecycle
    .fromExecutorService(Executors.newFixedThreadPool(16))
    .map(ExecutionContext.fromExecutor(_))
}

Also, I think this problem can be avoided in the future with this change: #1585. We already supply default bindings for ExecutionContext, but only for ZIO right now; I fixed that oversight in the PR, so you'll be able to use the default ExecutionContext @Id("cpu") without having to make a custom one.
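
For illustration, once that lands, a component could just summon the framework-provided pool instead of building one by hand (hypothetical usage assuming the default binding from #1585; MyComponent is a made-up name):

import scala.concurrent.ExecutionContext
import cats.effect.{ContextShift, IO}
import izumi.distage.model.definition.Id

// Hypothetical sketch: depend on the framework-provided pool via @Id("cpu")
// rather than constructing a thread pool in the launcher.
final class MyComponent(@Id("cpu") ec: ExecutionContext)(implicit cs: ContextShift[IO]) {
  def runOnCpuPool[A](fa: IO[A]): IO[A] = cs.evalOn(ec)(fa)
}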

Oh, now I understand. Thanks for the explanation.
My investigation took me deep into the details, but the answer was much simpler :D

By the way, #1585 is a great idea; I'll wait for that feature, but for now I'll try the second option with the helper :)

I suppose this issue can be closed now.
Again, thanks for your answers and patience :)