amphp / cluster

Building multi-core network applications with PHP.

Restart Workers

kelunik opened this issue

We have to implement worker restarts and graceful reloading, preferably with zero downtime.

Proposal for graceful restart:

  • The watcher spawns new workers and tags the old workers as "shutting down" by moving them to a separate collection such as "shuttingDownWorkers".
  • The watcher sends a "shut down" message to the old workers.
  • Each worker reacts to the "shut down" message by initiating a graceful shutdown and simply stops once it is done.
  • The watcher kills any worker that is still running after $gracefulShutdownTimeout.
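The four steps above can be modelled in a few lines. This is only a sketch of the bookkeeping, written in Python with a simulated clock instead of real processes. The class and field names (`Worker`, `Watcher`, `busy_until`, `tick`) are made up for illustration and are not the amphp/cluster API. Only the "shutting down" collection and the timeout come from the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Worker:
    worker_id: int
    busy_until: float  # simulated time at which in-flight requests finish
    shutdown_requested_at: Optional[float] = None


@dataclass
class Watcher:
    graceful_shutdown_timeout: float
    workers: List[Worker] = field(default_factory=list)
    shutting_down_workers: List[Worker] = field(default_factory=list)
    killed: List[int] = field(default_factory=list)
    next_id: int = 0

    def spawn(self, count: int, busy_until: float = 0.0) -> None:
        for _ in range(count):
            self.workers.append(Worker(self.next_id, busy_until))
            self.next_id += 1

    def restart(self, now: float) -> None:
        old = self.workers
        self.workers = []
        self.spawn(len(old))                    # step 1: spawn replacements
        for worker in old:
            worker.shutdown_requested_at = now  # step 2: "shut down" message
        self.shutting_down_workers.extend(old)  # old workers are only tagged

    def tick(self, now: float) -> None:
        still_running = []
        for worker in self.shutting_down_workers:
            if now >= worker.busy_until:
                continue                        # step 3: worker stopped by itself
            waited = now - worker.shutdown_requested_at
            if waited >= self.graceful_shutdown_timeout:
                self.killed.append(worker.worker_id)  # step 4: forced kill
                continue
            still_running.append(worker)
        self.shutting_down_workers = still_running
```

In this model a graceful shutdown is the same `restart` without the `spawn` call. New workers accept work as soon as they are spawned, which is where the zero-downtime property would come from.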

Graceful shutdown:
Same as graceful restart, except that the watcher does not spawn new workers.

We could also shut down the workers one after the other, waiting for a replacement worker to be up before shutting down the next one. This has the benefit that the workers can be given clearly defined worker IDs. The disadvantage is that there might be inconsistencies for a longer period of time. We could avoid that by delaying all listen requests until all new workers are up, but then it is no longer zero downtime.

I would not do this. What if shutting down a worker takes 10s (ending long-running requests gracefully)? With 30 workers, that could take 300s, while doing it in parallel would take 10s.
My proposal also has no downtime.
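The arithmetic behind that objection, as a quick check. The function names are made up for this sketch. It assumes the sequential strategy waits for each worker in turn, while the parallel strategy is bounded by the slowest worker.

```python
def sequential_shutdown_seconds(shutdown_times):
    # One worker at a time: the waits add up.
    return sum(shutdown_times)


def parallel_shutdown_seconds(shutdown_times):
    # All workers at once: only the slowest one matters.
    return max(shutdown_times, default=0)


times = [10] * 30  # 30 workers, 10s each, as in the comment above
print(sequential_shutdown_seconds(times))  # 300
print(parallel_shutdown_seconds(times))    # 10
```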

Both strategies have pros and cons. Would it be possible to support both and let the end user choose depending on their application / use case?

This has been implemented: both automatic restarts of workers and explicit restarting of the cluster.

If a worker dies unexpectedly, the Watcher instance automatically starts a new process to replace it. There are two examples demonstrating this: failing-process.php calls exit in the worker and out-of-memory.php concatenates huge strings until running out of memory.
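The general idea of replacing a dead worker can be sketched as below. This is a generic Python illustration using `subprocess`, not how the Watcher is actually implemented. The `supervise` function and its `max_restarts` cap are assumptions added so that the example terminates. A real watcher would supervise many workers and keep replacing them.

```python
import subprocess
import sys


def supervise(command, max_restarts):
    """Run command, starting a replacement each time it exits with an error.

    Returns the exit codes of every run. Stops after a clean exit (code 0)
    or once max_restarts replacements have been started.
    """
    exit_codes = []
    while True:
        process = subprocess.Popen(command)
        exit_codes.append(process.wait())
        if exit_codes[-1] == 0 or len(exit_codes) > max_restarts:
            return exit_codes


if __name__ == "__main__":
    # A child that calls exit with an error code, in the spirit of
    # failing-process.php.
    failing_worker = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(failing_worker, max_restarts=2))
```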

The entire cluster can be restarted using Watcher::restart() or by sending SIGUSR1 to bin/cluster. The first use I can think of for this feature is reloading code changes without shutting down the main watcher process. As a further thought, a filesystem monitor could be coupled with this to automatically restart the workers when files in a project change.
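A filesystem monitor of that kind could be as simple as polling modification times. The sketch below is a hypothetical Python illustration and is not part of the package. `snapshot`, `poll_once` and the `on_change` callback are invented names. The callback stands in for whatever triggers the restart, for example sending SIGUSR1 to the watcher process.

```python
import os


def snapshot(root):
    """Map every file path under root to its modification time in nanoseconds."""
    mtimes = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            try:
                mtimes[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                pass  # file vanished between listing and stat
    return mtimes


def poll_once(root, previous, on_change):
    """Compare the tree against the previous snapshot and report any change.

    Calls on_change() if a file was added, removed or modified, and returns
    the new snapshot to pass into the next poll.
    """
    current = snapshot(root)
    if current != previous:
        on_change()
    return current
```

Calling `poll_once` from a timer every second or so would be enough for development use. An event-based monitor would avoid rescanning the tree on every poll.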

Please give this package a try. I'd appreciate feedback and bug reports.