meglory / supervisor_checks

Framework to build health checks for Supervisor-based services.

Supervisor Health Checks

Health check programs are meant to run as event listeners in a Supervisor environment. When a check fails, Supervisor will attempt to restart the monitored process.

Here's a typical configuration example:

[eventlistener:example_check]
command=python <path_to_supervisor_check_program>
stderr_logfile = /var/log/supervisor/supervisor_example_check-stderr.log
stdout_logfile = /var/log/supervisor/supervisor_example_check-stdout.log
events=TICK_60
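
For context, Supervisor talks to an event listener over stdin and stdout: the listener prints READY, Supervisor sends a header line of key:value tokens followed by a payload of len characters, and the listener answers with a RESULT line. The sketch below illustrates only that handshake using in-memory streams. It is not this package's implementation, and parse_header, handle_one_event and the on_event callback are hypothetical names.

```python
import io


def parse_header(line):
    """Split a Supervisor event header such as 'eventname:TICK_60 len:15'."""
    return dict(token.split(':', 1) for token in line.split())


def handle_one_event(stdin, stdout, on_event):
    """Process one event using the Supervisor event listener protocol."""
    # Tell Supervisor the listener is ready to accept an event.
    stdout.write('READY\n')
    stdout.flush()

    headers = parse_header(stdin.readline())
    payload = stdin.read(int(headers['len']))

    # A health check would run here; on_event is a hypothetical callback.
    on_event(headers, payload)

    # Acknowledge the event so Supervisor can send the next one.
    stdout.write('RESULT 2\nOK')
    stdout.flush()


# Simulate Supervisor sending a single TICK_60 event.
payload = 'when:1700000000'
fake_stdin = io.StringIO(
    'ver:3.0 eventname:TICK_60 len:%d\n%s' % (len(payload), payload))
fake_stdout = io.StringIO()
seen = []
handle_one_event(
    fake_stdin, fake_stdout, lambda h, p: seen.append((h['eventname'], p)))
```

A real listener would loop over this handshake on sys.stdin and sys.stdout and write any log output to stderr, since stdout is reserved for the protocol.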

The package provides the following check programs out of the box:

  • supervisor_http_check: process check based on HTTP query.
  • supervisor_tcp_check: process check based on TCP connection status.
  • supervisor_xmlrpc_check: process check based on call to XML-RPC server.
  • supervisor_memory_check: process check based on amount of memory consumed by process.
  • supervisor_cpu_check: process check based on CPU percent usage within time interval.
  • supervisor_complex_check: complex check (run multiple checks at once).

For now, it is developed for and expected to work primarily with Python 3 and the Supervisor 4 branch. There is nominal Python 2.x support, but it is not tested.


Install and update using pip

pip install supervisor_checks

Developing Custom Check Modules

While the framework provides a good set of ready-to-use health check classes, it can easily be extended with application-specific custom health checks.

To implement a custom check class, inherit from the check_modules.base.BaseCheck class:

    class BaseCheck(object):
        """Base class for checks."""

        NAME = None

        def __call__(self, process_spec):
            """Run single check.

            :param dict process_spec: process specification dictionary as returned
                   by SupervisorD API.

            :return: True if check succeeded, otherwise False. If check failed -
                     monitored process will be automatically restarted.

            :rtype: bool
            """

        def _validate_config(self):
            """Method may be implemented in subclasses. Should return None or
            raise InvalidCheckConfig if the configuration is invalid.

            Here's a typical example of a parameter check:

              if 'url' not in self._config:
                  raise errors.InvalidCheckConfig(
                      'Required `url` parameter is missing in %s check config.' % (
                          self.NAME,))
            """

Here's an example of adding a custom check:

    from supervisor_checks.check_modules import base
    from supervisor_checks import check_runner

    class ExampleCheck(base.BaseCheck):

        NAME = 'example'

        def __call__(self, process_spec):

            # Always return True
            return True

    if __name__ == '__main__':

        check_runner.CheckRunner(
            'example_check', 'some_process_group', [(ExampleCheck, {})]).run()
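
A slightly more realistic custom check might look like the sketch below, which passes while a heartbeat file has been modified recently. So that it runs on its own, it uses a stand-in BaseCheck and assumes the real base class stores the check configuration in self._config (the attribute used in the _validate_config docstring above). HeartbeatFileCheck and the path and max_age config keys are hypothetical.

```python
import os
import tempfile
import time


class BaseCheck(object):
    """Stand-in for check_modules.base.BaseCheck so the sketch runs alone.

    Assumption: the real base class keeps the check config in self._config.
    """

    NAME = None

    def __init__(self, config):
        self._config = config


class HeartbeatFileCheck(BaseCheck):
    """Hypothetical check: passes while a heartbeat file is fresh."""

    NAME = 'heartbeat_file'

    def __call__(self, process_spec):
        path = self._config['path']
        max_age = self._config.get('max_age', 60)
        try:
            age = time.time() - os.path.getmtime(path)
        except OSError:
            # A missing or unreadable file counts as a failed check.
            return False
        return age <= max_age


# The process_spec argument is a minimal stand-in dictionary here.
with tempfile.NamedTemporaryFile() as heartbeat:
    fresh = HeartbeatFileCheck({'path': heartbeat.name})({'name': 'worker'})

# The temporary file is deleted on exit from the block above.
missing = HeartbeatFileCheck({'path': heartbeat.name})({'name': 'worker'})
```

In real use the class would inherit from base.BaseCheck and be passed to the check runner together with its config dictionary, in the same way as ExampleCheck.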

Out-of-box checks

HTTP Check

Process check based on HTTP query.


$ /usr/local/bin/supervisor_http_check -h
usage: supervisor_http_check [-h] -n CHECK_NAME -g PROCESS_GROUP -u URL -p
                             PORT [-t TIMEOUT] [-r NUM_RETRIES]

Run HTTP check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Health check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -u URL, --url URL     HTTP check url
  -m METHOD, --method METHOD
                        HTTP request method (GET, POST, PUT...)
  -j JSON, --json JSON  HTTP json body, auto sets content-type header to
  -b BODY, --body BODY  HTTP body, will be ignored if json body pass in
  -H HEADERS, --headers HEADERS
                        HTTP headers as json
  -U USERNAME, --username USERNAME
                        HTTP check username
  -P PASSWORD, --password PASSWORD
                        HTTP check password
  -p PORT, --port PORT  HTTP port to query. Can be integer or regular
                        expression which will be used to extract port from a
                        process name.
  -t TIMEOUT, --timeout TIMEOUT
                        Connection timeout. Default: 15
  -r NUM_RETRIES, --num-retries NUM_RETRIES
                        Connection retries. Default: 2

Configuration Examples

Query process running on port 8080 using URL /ping:

command=/usr/local/bin/supervisor_http_check -g example_service -n example_check -u /ping -t 30 -r 3 -p 8080

Query a process group using URL /ping. Each process listens on its own port. Each process name has the form some-process-name_port, so the port number can be extracted using a regular expression:

command=/usr/local/bin/supervisor_http_check -g example_service -n example_check -u /ping -t 30 -r 3 -p ".+_(\\d+)"
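
The port option described above accepts either an integer or a regular expression applied to the process name. A minimal sketch of how such a value could be resolved is shown below. The resolve_port helper is hypothetical and only illustrates the idea; the package's own logic may differ.

```python
import re


def resolve_port(port_spec, process_name):
    """Return a port from an integer-like spec or a regex with one group."""
    if str(port_spec).isdigit():
        return int(port_spec)
    match = re.match(port_spec, process_name)
    if match is None:
        raise ValueError('cannot extract port from %r' % process_name)
    # The first capture group is expected to hold the port digits.
    return int(match.group(1))


fixed = resolve_port('8080', 'example_service')
extracted = resolve_port(r'.+_(\d+)', 'some-process-name_8081')
```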

TCP Check

Process check based on TCP connection status.


$ /usr/local/bin/supervisor_tcp_check -h
usage: supervisor_tcp_check [-h] -n CHECK_NAME -g PROCESS_GROUP -p PORT
                            [-t TIMEOUT] [-r NUM_RETRIES]

Run TCP check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -p PORT, --port PORT  TCP port to query. Can be integer or regular
                        expression which will be used to extract port from a
                        process name.
  -t TIMEOUT, --timeout TIMEOUT
                        Connection timeout. Default: 15
  -r NUM_RETRIES, --num-retries NUM_RETRIES
                        Connection retries. Default: 2

Configuration Examples

Connect to process running on port 8080:

command=/usr/local/bin/supervisor_tcp_check -g example_service -n example_check -t 30 -r 3 -p 8080

Query a process group in which each process listens on its own port. Each process name has the form some-process-name_port, so the port number can be extracted using a regular expression:

command=/usr/local/bin/supervisor_tcp_check -g example_service -n example_check -t 30 -r 3 -p ".+_(\\d+)"
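
The idea behind a TCP check with a timeout and retries can be sketched as follows. This is an illustration rather than the package's code: tcp_check is a hypothetical helper, and it assumes that the retry count means extra attempts after the first one.

```python
import socket


def tcp_check(host, port, timeout=15, num_retries=2):
    """Return True if a TCP connection succeeds within the allowed attempts."""
    for _ in range(num_retries + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused connections and timeouts both lead to another attempt.
            continue
    return False


# Demonstrate against a throwaway listener on an OS-assigned local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
open_port = listener.getsockname()[1]

reachable = tcp_check('127.0.0.1', open_port, timeout=2)
listener.close()
```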


XML-RPC Check

Process check based on a call to an XML-RPC server.


$ /usr/local/bin/supervisor_xmlrpc_check -h
usage: supervisor_xmlrpc_check [-h] -n CHECK_NAME -g PROCESS_GROUP [-u URL]
                               [-s SOCK_PATH] [-S SOCK_DIR] [-p PORT]
                               [-r NUM_RETRIES]

Run XML RPC check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Health check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -u URL, --url URL     XML RPC check url
  -s SOCK_PATH, --socket-path SOCK_PATH
                        Full path to XML RPC server local socket
  -S SOCK_DIR, --socket-dir SOCK_DIR
                        Path to XML RPC server socket directory. Socket name
                        will be constructed using process name:
  -m METHOD, --method METHOD
                        XML RPC method name. Default is status
  -p PORT, --port PORT  Port to query. Can be integer or regular
                        expression which will be used to extract port from a
                        process name.
  -r NUM_RETRIES, --num-retries NUM_RETRIES
                        Connection retries. Default: 2

Configuration Examples

Call the process's XML-RPC server listening on port 8080 at URL /status, using RPC method get_status:

command=/usr/local/bin/supervisor_xmlrpc_check -g example_service -n example_check -r 3 -p 8080 -u /status -m get_status

Call the process's XML-RPC server listening on a UNIX socket:

command=/usr/local/bin/supervisor_xmlrpc_check -g example_service -n example_check -r 3 -s /var/run/example.sock -m get_status

Call the XML-RPC servers of a process group, each listening on a different UNIX socket. In this case the socket directory must be specified, and each process socket name is formed as <process_name>.sock:

command=/usr/local/bin/supervisor_xmlrpc_check -g example_service -n example_check -r 3 -S /var/run/ -m get_status
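
To illustrate what such a check does over TCP, the sketch below calls a method on an XML-RPC server with the standard library client and treats any transport error or fault as a failure. It is not the package's implementation and does not cover the UNIX socket case. xmlrpc_check is a hypothetical helper, and the throwaway server serves /RPC2 (the standard library default path) rather than /status.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def xmlrpc_check(url, method='status'):
    """Return True if calling `method` on the server at `url` succeeds."""
    try:
        with xmlrpc.client.ServerProxy(url) as proxy:
            getattr(proxy, method)()
        return True
    except (OSError, xmlrpc.client.Error):
        # Connection problems, HTTP errors and XML-RPC faults all fail.
        return False


# Throwaway local server exposing a get_status method for the demo.
server = SimpleXMLRPCServer(('127.0.0.1', 0), logRequests=False)
server.register_function(lambda: 'ok', 'get_status')
threading.Thread(target=server.serve_forever, daemon=True).start()

url = 'http://127.0.0.1:%d/RPC2' % server.server_address[1]
healthy = xmlrpc_check(url, 'get_status')
unknown_method = xmlrpc_check(url, 'no_such_method')

server.shutdown()
server.server_close()
```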

Memory Check

Process check based on amount of memory consumed by process.


$ /usr/local/bin/supervisor_memory_check -h
usage: supervisor_memory_check [-h] -n CHECK_NAME -g PROCESS_GROUP -m MAX_RSS
                               [-c CUMULATIVE]

Run memory check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Health check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -m MAX_RSS, --msx-rss MAX_RSS
                        Maximum memory allowed to use by process, KB.
  -c CUMULATIVE, --cumulative CUMULATIVE
                        Recursively calculate memory used by all process

Configuration Examples

Restart the process if the total memory consumed by the process and all its children exceeds 100 MB (102400 KB):

command=/usr/local/bin/supervisor_memory_check -n example_check -m 102400 -c -g example_service
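
The difference between the plain and cumulative modes can be sketched with a pure function over sample data. The memory_check helper and its inputs are hypothetical; a real check would read the resident set size and the child processes from the operating system.

```python
def memory_check(rss_by_pid, children, root_pid, max_rss_kb, cumulative=False):
    """Return True while RSS (in KB) stays within max_rss_kb.

    rss_by_pid maps pid -> resident set size in KB and children maps
    pid -> list of child pids. Both are hypothetical sample inputs.
    """
    total = rss_by_pid[root_pid]
    if cumulative:
        # Walk the process tree and add the RSS of every descendant.
        stack = list(children.get(root_pid, []))
        while stack:
            pid = stack.pop()
            total += rss_by_pid[pid]
            stack.extend(children.get(pid, []))
    return total <= max_rss_kb


rss = {100: 60000, 101: 30000, 102: 20000}  # KB per process
tree = {100: [101], 101: [102]}             # 100 -> 101 -> 102

alone_ok = memory_check(rss, tree, 100, max_rss_kb=102400)
with_children_ok = memory_check(
    rss, tree, 100, max_rss_kb=102400, cumulative=True)
```

With these sample numbers the parent alone uses 60000 KB and passes, while the whole tree uses 110000 KB and exceeds the 102400 KB limit.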

CPU Check

Process check based on CPU percent usage within specified time interval.


$ /usr/local/bin/supervisor_cpu_check -h
usage: supervisor_cpu_check [-h] -n CHECK_NAME -g PROCESS_GROUP -p MAX_CPU -i INTERVAL

Run memory check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Health check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -p MAX_CPU, --max-cpu-percent MAX_CPU
                        Maximum CPU percent usage allowed to use by process
                        within time interval.
  -i INTERVAL, --interval INTERVAL
                        How long process is allowed to use CPU over threshold,

Configuration Examples

Restart the process when it uses more than 100% CPU for 30 minutes (1800 seconds):

command=/usr/local/bin/supervisor_cpu_check -n example_check -p 100 -i 1800 -g example_service
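
One way to read the interval option is that the check fails only after CPU usage has stayed above the threshold for the whole interval. The sketch below implements that interpretation with explicit timestamps. CpuThresholdTracker is a hypothetical class, and the package's actual logic may differ.

```python
class CpuThresholdTracker(object):
    """Fail only after CPU stays above max_cpu for `interval` seconds."""

    def __init__(self, max_cpu, interval):
        self.max_cpu = max_cpu
        self.interval = interval
        self._over_since = None  # time when usage first crossed the limit

    def update(self, cpu_percent, now):
        """Record one sample; return True if the process is still healthy."""
        if cpu_percent <= self.max_cpu:
            # Usage is back under the limit, so the timer resets.
            self._over_since = None
            return True
        if self._over_since is None:
            self._over_since = now
        return (now - self._over_since) < self.interval


tracker = CpuThresholdTracker(max_cpu=100, interval=1800)
results = [
    tracker.update(150, now=0),     # first sample over the limit
    tracker.update(150, now=900),   # still over, 15 minutes in
    tracker.update(40, now=1200),   # dropped below, timer resets
    tracker.update(150, now=1500),  # over again, timer restarts
    tracker.update(150, now=3300),  # over for 1800 s without a break
]
```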

Complex Check

Complex check (run multiple checks at once).


$ /usr/local/bin/supervisor_complex_check -h
usage: supervisor_complex_check [-h] -n CHECK_NAME -g PROCESS_GROUP -c

Run SupervisorD check program.

optional arguments:
  -h, --help            show this help message and exit
  -n CHECK_NAME, --check-name CHECK_NAME
                        Health check name.
  -g PROCESS_GROUP, --process-group PROCESS_GROUP
                        Supervisor process group name.
  -N PROCESS_NAME, --process-name PROCESS_NAME
                        Supervisor process name. Process group argument is
                        ignored if this is passed in
  -c CHECK_CONFIG, --check-config CHECK_CONFIG
                        Check config in JSON format

Example configuration

Here's an example configuration using the memory and http checks:

command=/usr/local/bin/supervisor_complex_check -n example_check -g example_service -c '{"memory":{"cumulative":true,"max_rss":4194304},"http":{"timeout":15,"port":8090,"url":"\/ping","num_retries":3}}'
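
Writing the JSON by hand is error-prone, so the command line can also be generated. The sketch below builds the same configuration with the standard json and shlex modules. It is only a convenience illustration, and it assumes the keys shown in the example above are the ones the checks expect.

```python
import json
import shlex

check_config = {
    'memory': {'cumulative': True, 'max_rss': 4194304},
    'http': {'timeout': 15, 'port': 8090, 'url': '/ping', 'num_retries': 3},
}

# Compact separators keep the command short; sort_keys makes it stable.
config_json = json.dumps(check_config, separators=(',', ':'), sort_keys=True)

command = ' '.join([
    '/usr/local/bin/supervisor_complex_check',
    '-n', 'example_check',
    '-g', 'example_service',
    '-c', shlex.quote(config_json),
])
```

The escaped form "\/ping" in the hand-written example and the plain "/ping" produced here decode to the same string, since JSON allows but does not require escaping the slash.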


This package is inspired by the Superlance plugin package.

However, while Superlance is essentially a set of feature-rich health check programs, the supervisor_checks package focuses on providing a framework for easily implementing application-specific health checks of any complexity.

Bug reports

Please file here:

Or contact me directly:

License:MIT License


Language:Python 100.0%