
Terraform Module: Google Workers

A Terraform module for the Google Cloud Platform that makes it easy to create and configure a pool of instances that run multiple background processors in Docker containers.


Introduction

Many systems require workers that process jobs in the background. Typically, these jobs are not time-sensitive, and take longer to run than is acceptable for most user interfaces. Examples of such workers include queue processors, cron jobs and email senders.

This Terraform module makes it simple to create a Managed Instance Group that runs multiple workers per instance. If any CloudSQL connections are required, a sidecar container is automatically created that manages the connections to the database, and provides them as UNIX sockets through a shared volume.


Terminology

A few terms are used throughout this module & documentation. The most common ones are outlined below for clarity:

  • worker

    A long-running process that handles and processes jobs.

  • timer

    A process that is run on a pre-determined schedule, similar to a cron job.

  • instance

    A Google Compute Engine instance on which timers and workers are run.

  • pool

    A group of instances that are part of a managed instance group.

Usage

module "worker" {
  source = "garbetjie/worker-pool/google"
  
  name = "my-pool"
  location = "europe-west4"
  disk_size = 25
  disk_type = "pd-balanced"
  instance_count = 1
  labels = { "my-label" = "my label value" }
  machine_type = "f1-micro"
  metadata = { "my-key" = "my metadata value" }
  network = "default"
  preemptible = false
  runcmd = ["touch /tmp/touched"]
  service_account_email = "serviceAccount@my-project-id.iam.gserviceaccount.com"
  tags = ["my-tag"]
  timezone = "Etc/UTC"
  wait_for_instances = false
  
  workers = {
    image = "nginx:latest"
    replicas = 1
    args = ["nginx"]
    env = { "ENV_KEY" = "value" }
    user = "root"
    expose = [{ port = 80, container_port = 80, host = "0.0.0.0", protocol = "tcp" }]
    mounts = [{ src = "my-data", target = "/www", type = "volume", readonly = true }]
    restart_policy = "always"
    restart_interval = 3
    pre = [{ args = ["mkdir", "-p", "/www/dir"], image = "alpine:latest", user = "root" }]
  }
  
  timers = [{
    schedule = "minutely"
    args = ["curl", "$${PROTOCOL}://example.org"]
    image = "alpine:latest"
    user = null
    env = { PROTOCOL = "https" }
    mounts = []
  }]
  
  logging = {
    driver = "local"
    options = {}
  }
  
  cloudsql = {
    connections = ["my-project-id:europe-west4:my-instance"]
    wait_duration = 30
    mount_name = "cloudsql"
    mount_path = "/cloudsql"
    restart_policy = "always"
    restart_interval = 3
  }
  
  health_check = {
    enabled = true
    port = 4144
    name = "my-pool-health-check"
    interval = 10
    healthy_threshold = 3
    unhealthy_threshold = 3
    initial_delay = 60
  }
}

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| name | Name of the pool. | string | | Yes |
| location | Zone or region in which to create the pool. | string | | Yes |
| cloudsql | CloudSQL configuration. | object({ connections = set(string) }) | { connections = [] } | No |
| cloudsql.connections | CloudSQL connections to establish before starting workers. | set(string) | | Yes |
| cloudsql.wait_duration | How long to wait (in seconds) for CloudSQL connections to be established before starting workers. | number | 30 | No |
| cloudsql.mount_name | Name of the volume created for mounting into workers. | string | "cloudsql" | No |
| cloudsql.mount_path | Path at which to mount the volume containing CloudSQL connections into workers & timers. | string | "/cloudsql" | No |
| cloudsql.restart_policy | Restart policy to apply to the CloudSQL service. Must be one of ["always", "no", "on-success", "on-failure", "on-abnormal", "on-watchdog", "on-abort"]. | string | "always" | No |
| cloudsql.restart_interval | Number of seconds to wait before restarting the CloudSQL service if it stops. | number | 5 | No |
| disk_size | Disk size (in GB) to create instances with. | number | 25 | No |
| disk_type | Disk type to create instances with. Must be one of ["pd-ssd", "local-ssd", "pd-balanced", "pd-standard"]. | string | "pd-balanced" | No |
| health_check | Health check configuration. | object({ enabled = bool }) | { enabled = false } | No |
| health_check.enabled | Flag indicating whether the health check is enabled. | bool | | Yes |
| health_check.port | Host port exposed for the health check. | number | 4144 | No |
| health_check.name | Name to create the health check with. | string | "${var.name}-healthy" | No |
| health_check.interval | Interval (in seconds) between checks. | number | 10 | No |
| health_check.healthy_threshold | Number of consecutive checks that must succeed for an instance to be marked as healthy. | number | 3 | No |
| health_check.unhealthy_threshold | Number of consecutive checks that must fail for an instance to be marked as unhealthy. | number | 3 | No |
| health_check.initial_delay | Number of seconds to allow instances to boot before starting health checks. | number | 60 | No |
| instance_count | Number of instances to create in the pool. | number | 1 | No |
| labels | Labels to apply to all instances in the pool. | map(string) | {} | No |
| logging | Docker logging configuration. | object({ driver = string }) | { driver = "local" } | No |
| logging.driver | Driver to use as the default. See https://docs.docker.com/config/containers/logging/configure/#supported-logging-drivers for supported drivers. | string | | Yes |
| logging.options | Options to specify for the configured driver. See the Logging section for default options per driver. | map(string) | {} | No |
| machine_type | Machine type to create instances in the pool with. | string | "f1-micro" | No |
| metadata | Additional metadata to add to instances. Any metadata with the key "user-data" is ignored. | map(string) | {} | No |
| network | Network name or link in which to create the pool. | string | "default" | No |
| preemptible | Whether or not to create preemptible instances. | bool | false | No |
| runcmd | Additional commands to run on instance startup. These commands run after Docker is configured & restarted, and immediately before any workers & CloudSQL connections are started. | list(string) | [] | No |
| service_account_email | Service account to assign to the pool. | string | null | No |
| tags | Network tags to apply to instances in the pool. | list(string) | [] | No |
| timers | List of timers to run on a set schedule. | list(object({ schedule = string })) | [] | No |
| timers.*.schedule | The schedule on which this timer should run, in systemd OnCalendar format. | string | | Yes |
| timers.*.args | Arguments to pass to the timer. See the section on argument escaping for information on formatting. | list(string) | [] | No |
| timers.*.image | Docker image on which the timer is based. Defaults to the image specified in var.workers.image. | string | var.workers.image | No |
| timers.*.user | User to run the timer as. Defaults to the user specified in var.workers.user. | string | var.workers.user | No |
| timers.*.env | Environment variables to inject into the timer. Defaults to those specified in var.workers.env. | map(string) | var.workers.env | No |
| timers.*.mounts | Volumes to mount into the timer container. Defaults to those specified in var.workers.mounts. See the worker inputs for the full mount specification. | object({ src = string, target = string, type = optional(string), readonly = optional(bool) }) | var.workers.mounts | No |
| timezone | Timezone to use on instances. See the "TZ database name" column on https://en.wikipedia.org/wiki/List_of_tz_database_time_zones for available timezone names. | string | "Etc/UTC" | No |
| wait_for_instances | Wait for instances to stabilise after updating the pool's instance group. | bool | false | No |
| workers | Worker configuration. | object({ image = string, replicas = number }) | | Yes |
| workers.image | Docker image on which the workers are based. | string | | Yes |
| workers.replicas | Number of workers to start per instance. | number | | Yes |
| workers.args | Arguments to pass to workers. See the section on argument escaping for line break support. | list(string) | [] | No |
| workers.env | Environment variables to inject into workers. | map(string) | {} | No |
| workers.user | User to run workers as. Passed unmodified to the -u flag when running workers. | string | null | No |
| workers.restart_policy | Restart policy to apply to failed workers. Must be one of ["always", "no", "on-success", "on-failure", "on-abnormal", "on-watchdog", "on-abort"]. | string | "always" | No |
| workers.restart_interval | Number of seconds to wait before restarting a failed worker. | number | 5 | No |
| workers.expose | Container ports to expose on the host. Should not be used if workers.replicas > 1. | list(object({ port = number, protocol = string, container_port = number, host = string })) | [] | No |
| workers.expose.*.port | Port on the host to map to the container. | number | | Yes |
| workers.expose.*.container_port | Port on the container to map to the host port. Defaults to port if not specified. | number | port | No |
| workers.expose.*.host | IP address on which to bind the port on the host. | string | "0.0.0.0" | No |
| workers.expose.*.protocol | Protocol on which the port listens. Can be one of "tcp" or "udp". | string | "tcp" | No |
| workers.pre | Containers to execute immediately before starting each worker. Can be used to prepare workers (changing permissions on directories, etc). | list(object({ args = list(string), image = string, user = string })) | [] | No |
| workers.pre.*.args | Arguments to pass to the init container. See the section on argument escaping for line break support. | list(string) | [] | No |
| workers.pre.*.image | Docker image to run the init container with. Defaults to the value of workers.image. | string | workers.image | No |
| workers.pre.*.user | User to run the init container as. | string | null | No |
| workers.mounts | Volumes to mount into the worker containers. | list(object({ src = string, target = string, type = string, readonly = bool })) | [] | No |
| workers.mounts.*.src | Source to mount into the worker containers. | string | | Yes |
| workers.mounts.*.target | Target path in the container for the volume or bind source. | string | | Yes |
| workers.mounts.*.type | Type of mount. Can be one of "volume" or "bind". | string | "volume" | No |
| workers.mounts.*.readonly | Whether or not the mount is read-only. | bool | false | No |

CloudSQL

When at least one CloudSQL connection is specified in the var.cloudsql.connections input attribute, an instance of the CloudSQL proxy is started for you, and a volume containing the configured UNIX sockets is created. This makes it easy to provide CloudSQL connections to your workers.

A cloudsql.service systemd unit is generated, and is added as a hard requirement to workers and timers - if the CloudSQL service fails to start, no workers or timers will run. Additionally, a volume is automatically mounted into all workers and timers at the directory specified by var.cloudsql.mount_path (defaulting to /cloudsql) containing the UNIX sockets that can be used to connect to the specified CloudSQL instances.

This requires the roles/cloudsql.client IAM role to be granted to the service account the instances run as.

Please note: Only UNIX socket connections are currently supported. When var.cloudsql.connections is populated, a shell script forces workers and timers to wait until the connections have been established. This script only works with UNIX sockets at this point.
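As a sketch, a pool with a single CloudSQL connection might look like the following (the connection name, image and DB_SOCKET variable are placeholders; the CloudSQL proxy creates one socket per connection, named after the connection, under var.cloudsql.mount_path):

module "worker" {
  source = "garbetjie/worker-pool/google"

  name     = "db-workers"
  location = "europe-west4"

  cloudsql = {
    connections = ["my-project-id:europe-west4:my-instance"]
  }

  workers = {
    image    = "my-app:latest" # Placeholder image.
    replicas = 2
    env = {
      # The socket for each connection appears at <mount_path>/<connection name>.
      DB_SOCKET = "/cloudsql/my-project-id:europe-west4:my-instance"
    }
  }
}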

Timers

Timers are systemd's implementation of cron jobs, and are the way to schedule tasks that need to run on a regular basis. When specifying timers, the following attributes are inherited from the worker configuration if not specified: image, user, mounts and env.

When specifying the schedule on which timers should run, it is the OnCalendar property that is populated. The systemd.time documentation contains more information on the allowed formats of this property.
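As a sketch, a nightly cleanup task might look like this (the image and command are placeholders):

module "worker" {
  source = "garbetjie/worker-pool/google"

  # ...

  workers = {
    image    = "my-app:latest" # Placeholder image.
    replicas = 1
  }

  timers = [{
    # OnCalendar format: runs every day at 03:00.
    schedule = "*-*-* 03:00:00"
    # Placeholder command; image, user, env & mounts are inherited from workers.
    args = ["bin/cleanup"]
  }]
}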

Argument escaping

Arguments passed in var.workers.args and var.timers.*.args are escaped, and can safely contain spaces, quotes and any number of values.

Using regular expressions to escape arguments would be too complex and error-prone for the wide range of possible inputs. So, as per https://stackoverflow.com/questions/28881758/how-can-i-use-spaces-in-systemd-command-line-arguments, arguments are exported as environment variables, and those variables are then referenced in the unit's command line - the generally accepted way of escaping arguments to systemd unit commands.

According to this pull request, \n in environment variable values is expanded to a newline as of systemd v239. Prior to this version, newlines are not supported in environment variables.
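In practice this means arguments can be supplied as-is, without any manual quoting. A sketch:

module "worker" {
  # ...

  workers = {
    image    = "alpine:latest"
    replicas = 1

    # Each element is exported as an environment variable and referenced in the
    # unit's command line, so spaces need no additional quoting.
    args = ["sh", "-c", "echo 'hello world' && sleep 3600"]
  }
}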

Logging

Both the logging driver and any associated options can be configured on all instances in the pool. Sensible defaults for the local and json-file logging drivers are provided below:

local

local = {
  max-size = "50m"
  max-file = "10"
  compress = "true"
}

json-file

json-file = {
  max-size = "50m"
  max-file = "10"
  compress = "true"
}
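As a sketch, switching the pool to the json-file driver with smaller rotated files might look like this (the option values are illustrative):

module "worker" {
  source = "garbetjie/worker-pool/google"

  # ...

  logging = {
    driver = "json-file"
    options = {
      max-size = "10m" # Rotate after 10 MB instead of the default 50 MB.
      max-file = "5"
    }
  }
}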

Health checks

Instances in the worker pool can be checked regularly to ensure they're still healthy and eligible to be in the pool.

Worker pool health checks ensure that all the requested workers are running (and the CloudSQL container if var.cloudsql.connections is not empty). If all workers are running, a port is opened on the instance for access by Google Cloud's health checks. If at least one worker is not running, this port will be closed - causing the health checks to start failing, and the unhealthy instance to be recreated in the worker pool.
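For Google's health checkers to reach the exposed port, a firewall rule is typically required. A sketch, assuming the pool module is named worker and uses the default port of 4144 (130.211.0.0/22 and 35.191.0.0/16 are Google Cloud's documented health check source ranges):

resource "google_compute_firewall" "worker_health_checks" {
  name    = "allow-worker-health-checks"
  network = "default"

  allow {
    protocol = "tcp"
    ports    = ["4144"]
  }

  # Google Cloud's documented health check source ranges.
  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]

  # The module generates a unique tag for targeting instances in firewall rules.
  target_tags = [module.worker.tag]
}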

Exposing workers

It is also possible to expose ports on workers. However, this is not possible when var.workers.replicas > 1. Simply populate the var.workers.expose attribute to begin exposing ports.

Below is an example of the minimum configuration required to expose nginx in a pool on port 80:

module "my_nginx" {
  source = "garbetjie/worker-pool/google"
  
  // ...
  workers = {
    image = "nginx:latest"
    replicas = 1
    expose = [{ port = 80 }]
  }
  // ...
}

Outputs

All inputs are exported as outputs. The following additional outputs are also defined:

| Name | Description | Type |
|------|-------------|------|
| instance_group_manager_self_link | Self link of the instance group manager. | string |
| instance_group_self_link | Self link of the instance group. | string |
| instance_template_self_link | Self link of the created instance template. | string |
| regional | Flag indicating whether a regional instance group manager was created. | bool |
| tag | Unique tag generated for instance targeting in firewall rules. | string |
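As a sketch, assuming the module is named worker, these can be referenced like any other module output:

output "worker_instance_group" {
  value = module.worker.instance_group_self_link
}

output "worker_firewall_tag" {
  value = module.worker.tag
}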

Changelog

  • 2.0.1

    • Replaced erroneous echo with exec in timer script.
  • 2.0.0

    • Completely refactor input configuration.
    • Default most attributes for timers to those supplied for workers.
  • 1.5.0

    • Add ability to specify the user that workers & timers run as.
  • 1.4.0

    • Add mounting of volumes into workers.
  • See CHANGELOG.md for a full history.

Roadmap

The points listed below are features that have been considered for possible inclusion in the future. There is no guarantee that any of them will ever make it into the module:

  • Sidecar containers.

    The ability to run arbitrary containers as sidecars on each instance. It is not yet clear whether these should be limited to using the top-level image/env (like timers); otherwise, configuration becomes much trickier.

  • Autoscaling.

    It would be great to be able to configure the instance group to scale automatically - either by a custom Stackdriver metric, or by CPU usage. First prize would be autoscaling by a custom metric.

  • Waiting for CloudSQL: TCP ports

    Currently, when waiting for CloudSQL to start, only socket connections are checked. If CloudSQL is actually listening on a TCP port, the script has no way of knowing.

  • Customising the health check port when ports are exposed.

    When ports are exposed, they're not taken into account by health checks. Perhaps the default health check port should only be used when no ports are exposed, and any exposed ports should be checked instead.


License

MIT License

