Documentation and scripts for running a full web application in a micro-services style. Many pieces are optional and could be swapped out to suit other preferences.
The unit files, scripts, and playbooks in the dist directory have been extracted from this document and pushed back to the repo.
- Alpine Linux is my preferred containerized OS, and the choice I've made for images
- CentOS is the chosen host operating system. It was originally CoreOS, but that proved incompatible with the Python needed by Ansible
- CentOS 7+ is required (the Docker install is incompatible with CentOS 6)
- Ansible is used for orchestration
- Metrics gathered by Prometheus
- Displayed with Grafana
- Dockerized nginx container to host static site
- TODO: Forward nginx logs to the docker service; see the Loggly logging strategy article
- Automatically reconfigures and refreshes nginx config based on routing configuration provided through etcd
- SSL termination
- By default forward http connections to https
- Have a configuration mode which allows initial letsencrypt validation over http
- Https certificate from letsencrypt with autorenewal
- Containerized Node.js server
- Publishes to etcd for discovery
- Expects a DB proxy on localhost
- OAuth & OAuth2 termination
- JWT generation and validation
- Dockerized RethinkDB
- Ansible based deployment and initialization
- Configure DNS
- Create tagged machine instances
- Create Ansible inventory (or use dynamic inventory script!)
- Firewall
The machine used as the controller will need SSH access to all the machines being managed. You can use one of the instances being managed; on GCE, Cloud Shell is a handy resource. I'm personally running in GCE using wac-gce-ansible
Here is an example inventory. wac-bp operates on machines based on the group they belong to. You can manually create the inventory file with the hosts to manage, or use a dynamic inventory script for your cloud provider.
hostnameone
hostnametwo
[etcd]
hostnameone
[rethinkdb]
hostnameone
[frontend]
hostnametwo
[backend]
hostnametwo
[prometheus]
hostnametwo
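Before running any playbooks it is worth confirming the controller can reach every host. A minimal connectivity check, assuming the inventory above is saved as a file named inventory:
ansible -i inventory all -m ping -b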
The all group variables file holds the cluster-wide configuration, including the container versions to use.
---
# The inventory variable which will be templated out for intra-machine connectivity. Match your manual inventory or dynamic inventory variable
internal_ip_name: private_ip
# The inventory variable holding the unique name of machine instances to be used in templates
machine_name: name
# Variables which get set into etcd (some of them are private!) needed by other applications
domain_name: <domain_name>
domain_email: <domain_email>
rethinkdb_web_password: <rethinkdb_web_password>
# Variables templated into the backend node process(es)
node_config:
  backend:
    jwt_token_secret: <jwt_token_secret>
    google_client_id: <google_client_id>
    google_redirect_uri: <google_redirect_uri>
    google_auth_secret: <google_auth_secret>
# Location of the frontend app on the controller.
frontend_src_path: /home/frontend/src
# Location of source(s) on the controller for the nodejs process(es)
node_src_path:
  backend: /home/backend/src
google_project_id: <google_project_id>
jwt_token_secret: <generated_secret>
google_client_id: <google_client_id>
google_redirect_uri: <redirect_uri>
google_auth_secret: <google_auth_secret>
# Google Cloud Functions and their properties
gcp_functions:
  - name: auth
    route: auth
    src_path: /srcs/gcp-functional-auth
    regions:
      - us-east1
    props:
      jwt_token_secret: "{{ jwt_token_secret }}"
      google_client_id: "{{ google_client_id }}"
      google_redirect_uri: "{{ google_redirect_uri }}"
      google_auth_secret: "{{ google_auth_secret }}"
      projectid: "{{ google_project_id }}"
  - name: createUser
    route: newUser
    src_path: /srcs/gcp-functional-auth
    regions:
      - us-east1
    props:
      jwt_token_secret: "{{ jwt_token_secret }}"
      google_client_id: "{{ google_client_id }}"
      google_redirect_uri: "{{ google_redirect_uri }}"
      google_auth_secret: "{{ google_auth_secret }}"
      projectid: "{{ google_project_id }}"
  - name: deleteUser
    route: deleteUser
    src_path: /srcs/gcp-functional-auth
    regions:
      - us-east1
    props:
      jwt_token_secret: "{{ jwt_token_secret }}"
      google_client_id: "{{ google_client_id }}"
      google_redirect_uri: "{{ google_redirect_uri }}"
      google_auth_secret: "{{ google_auth_secret }}"
      projectid: "{{ google_project_id }}"
  - name: getUser
    route: getUser
    src_path: /srcs/gcp-functional-auth
    regions:
      - us-east1
    props:
      jwt_token_secret: "{{ jwt_token_secret }}"
      google_client_id: "{{ google_client_id }}"
      google_redirect_uri: "{{ google_redirect_uri }}"
      google_auth_secret: "{{ google_auth_secret }}"
      projectid: "{{ google_project_id }}"
# The controller machine directory to stage archives at
controller_src_staging: /home/staging
# Ports map
ports:
  backend: 8080
  etcd_peer: 2380
  etcd_client: 2379
  prometheus: 9090
  grafana: 3000
  node_exporter: 9100
  nginx_prometheus_endpoint: 9145
# The container versions to use
rsync_version: latest
etcd_version: latest
nginx_version: latest
nginx_config_templater_version: latest
wac_acme_version: latest
nodejs_version: latest
prometheus_version: latest
node_exporter_version: latest
grafana_version: latest
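To sanity-check that these variables load and resolve before a full run, Ansible's debug module can render any of them ad hoc. A sketch (the inventory filename is illustrative):
ansible -i inventory all -m debug -a "var=ports"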
The main playbook that deploys or updates a cluster
# Make sure Docker is installed
- hosts: all:!localhost
  gather_facts: false
  become: true
  roles:
    - docker
# Place a full etcd on the etcd hosts
- hosts: etcd
  become: true
  roles:
    - { role: etcd, proxy_etcd: False, tags: [ 'etcd' ] }
# Set the etcd values (if required) from the first etcd host
- hosts: etcd[0]
  become: true
  roles:
    - { role: populate_etcd, tags: [ 'etcd' ] }
# Place a proxy etcd everywhere except the etcd hosts
- hosts: all:!etcd:!localhost
  become: true
  roles:
    - { role: etcd, proxy_etcd: True, tags: [ 'etcd' ] }
# Place Prometheus on the Prometheus hosts
- hosts: prometheus
  become: true
  roles:
    - { role: prometheus, tags: [ 'prometheus' ] }
    - role: discovery
      vars:
        parent: 'route_discovery'
        service: prometheus
        port: "{{ports['prometheus']}}"
        service_properties:
          private: 'true'
      tags: [ 'prometheus' ]
# Place Grafana as the frontend on the Prometheus hosts
- hosts: prometheus
  become: true
  roles:
    - { role: grafana, tags: [ 'grafana' ] }
    - role: discovery
      vars:
        parent: 'route_discovery'
        service: grafana
        port: "{{ports['grafana']}}"
        service_properties:
          upstreamRoute: '/'
          private: 'true'
      tags: [ 'grafana' ]
# Place prometheus node_exporter everywhere
- hosts: all:!localhost
  become: true
  roles:
    - { role: prometheus-node-exporter, tags: [ 'prometheus_node_exporter' ] }
# Remove the old localhost staging directory
- name: Remove old staging directory
  hosts: localhost
  tasks:
    - file:
        path: "{{controller_src_staging}}"
        state: absent
      tags:
        - frontend_application
        - backend
# Recreate the localhost staging directory
- name: Create local staging directory
  hosts: localhost
  tasks:
    - file:
        state: directory
        path: "{{controller_src_staging}}"
      tags:
        - frontend_application
        - backend
# nginx
- hosts: frontend
  become: true
  roles:
    - { role: frontend, tags: [ 'frontend' ] }
# Default backend nodejs process. The role can be applied additional times to different hosts with different configuration
- hosts: backend
  become: true
  roles:
    - { role: nodejs, identifier: backend, nodejs_port: "{{ports['backend']}}", tags: [ 'backend' ] }
    - role: discovery
      vars:
        parent: 'route_discovery'
        service: backend_nodejs
        port: "{{ports['backend']}}"
        service_properties:
          private: 'false'
      tags: [ 'backend' ]
# Deploy the Google Cloud Functions from localhost
- hosts: localhost
  become: true
  roles:
    - { role: gcp_functions, tags: [ 'functions' ] }
# Publish the Google Cloud Functions into etcd
- hosts: etcd[0]
  become: true
  roles:
    - { role: gcp_functions_publishing, tags: [ 'functions' ] }
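A typical invocation, assuming the playbook above is saved as site.yml (the filename is illustrative). Because the plays are tagged, a subset such as the static frontend push can be re-run on its own:
ansible-playbook -i inventory site.yml
ansible-playbook -i inventory site.yml --tags frontend_application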
A helper playbook that queries the systemctl status of all wac-bp deployed units and displays them locally
# check on etcd
- hosts: all:!localhost
  become: true
  tasks:
    - name: Check if etcd or the etcd proxy is running
      no_log: True
      command: systemctl status etcd.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_etcd_status
    - name: Report status of etcd
      debug:
        msg: "{{service_etcd_status.stdout.split('\n')}}"
# check on prometheus
- hosts: prometheus
  become: true
  tasks:
    - name: Check if prometheus is running
      no_log: True
      command: systemctl status prometheus.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_prometheus_status
    - name: Report status of prometheus
      debug:
        msg: "{{service_prometheus_status.stdout.split('\n')}}"
    - name: Check if prometheus route-publishing is running
      no_log: True
      command: systemctl status prometheus-route-publishing.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: prometheus_route_publishing_status
    - name: Report status of prometheus-route-publishing
      debug:
        msg: "{{prometheus_route_publishing_status.stdout.split('\n')}}"
# check on prometheus-node_exporter
- hosts: all:!localhost
  become: true
  tasks:
    - name: Check if prometheus-node-exporter is running
      no_log: True
      command: systemctl status prometheus-node-exporter.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_prometheus_node_exporter_status
    - name: Report status of prometheus-node_exporter
      debug:
        msg: "{{service_prometheus_node_exporter_status.stdout.split('\n')}}"
# check on grafana
- hosts: prometheus
  become: true
  tasks:
    - name: Check if grafana is running
      no_log: True
      command: systemctl status grafana.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_grafana_status
    - name: Report status of grafana
      debug:
        msg: "{{service_grafana_status.stdout.split('\n')}}"
# check on frontend services
- hosts: frontend
  become: true
  tasks:
    - name: Check if nginx is running
      no_log: True
      command: systemctl status nginx.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_nginx_status
    - name: Report status of nginx
      debug:
        msg: "{{service_nginx_status.stdout.split('\n')}}"
    - name: Check if nginx-reload is running
      no_log: True
      command: systemctl status nginx-reload.path --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_nginx_reload_status
    - name: Report status of nginx-reload
      debug:
        msg: "{{service_nginx_reload_status.stdout.split('\n')}}"
    - name: Check if route-discovery-watcher is running
      no_log: True
      command: systemctl status route-discovery-watcher.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_route_discovery_watcher_status
    - name: Report status of route-discovery-watcher
      debug:
        msg: "{{service_route_discovery_watcher_status.stdout.split('\n')}}"
    - name: Check if certificate-sync is running
      no_log: True
      command: systemctl status certificate-sync.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_certificate_sync_status
    - name: Report status of certificate-sync
      debug:
        msg: "{{service_certificate_sync_status.stdout.split('\n')}}"
    - name: Check if acme-response-watcher is running
      no_log: True
      command: systemctl status acme-response-watcher.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_acme_response_watcher_status
    - name: Report status of acme-response-watcher
      debug:
        msg: "{{service_acme_response_watcher_status.stdout.split('\n')}}"
    - name: Check if letsencrypt-renewal.timer is running
      no_log: True
      command: systemctl status letsencrypt-renewal.timer
      ignore_errors: yes
      changed_when: false
      register: service_letsencrypt_renewal_status
    - name: Report status of letsencrypt-renewal.timer
      debug:
        msg: "{{service_letsencrypt_renewal_status.stdout.split('\n')}}"
# check on backend services
- hosts: backend
  become: true
  tasks:
    - name: Check if nodejs is running
      no_log: True
      command: systemctl status backend_nodejs.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_backend_nodejs_status
    - name: Report status of nodejs
      debug:
        msg: "{{service_backend_nodejs_status.stdout.split('\n')}}"
    - name: Check if nodejs route-publishing is running
      no_log: True
      command: systemctl status backend_route-publishing.service --lines=0
      ignore_errors: yes
      changed_when: false
      register: service_backend_route_publishing_status
    - name: Report status of nodejs route-publishing
      debug:
        msg: "{{service_backend_route_publishing_status.stdout.split('\n')}}"
The roles used by the playbooks above
Install Docker onto remote hosts
roles/docker/tasks/main.yml
- name: Install Docker
  shell: curl -fsSL https://get.docker.com/ | sh
- name: Ensure Docker is started
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: docker.service
- name: Pull desired etcd image
  shell: docker pull chadautry/wac-etcdv2:{{etcd_version}}
- name: Install etcdctl from docker image
  shell: docker run --rm -v /usr/bin:/hostusrbin --entrypoint cp chadautry/wac-etcdv2:{{etcd_version}} /bin/etcdctl /hostusrbin
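Since etcdctl is copied onto the host itself rather than run in a container, a quick check that the binary landed (a sketch, using the v2 etcdctl flag form):
/usr/bin/etcdctl --version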
Deploys or redeploys the etcd instance on a host. Etcd is persistent, but if the cluster changes, wac-bp blows it away instead of attempting to add/remove instances. Deploys either a full instance or a proxy instance depending on the variable passed
# template out the systemd etcd.service unit on the etcd hosts
- name: etcd template
  template:
    src: etcd.service
    dest: /etc/systemd/system/etcd.service
  register: etcd_template
- name: wipe out etcd directory
  file:
    state: absent
    path: /var/etcd
  when: etcd_template is changed
- name: ensure etcd directory is present
  file:
    state: directory
    path: /var/etcd
  when: etcd_template is changed
- name: start/restart the etcd.service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: etcd.service
  when: etcd_template is changed
- name: Ensure etcd is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: etcd.service
  when: not (etcd_template is changed)
roles/etcd/templates/etcd.service
[Unit]
Description=etcd
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-etcdv2:{{etcd_version}}
ExecStartPre=-/usr/bin/docker rm etcd
ExecStart=/usr/bin/docker run --name etcd --net host -p 2380:2380 -p 2379:2379 \
-v /var/etcd:/var/etcd \
chadautry/wac-etcdv2:{{etcd_version}} \
{% if not proxy_etcd %}
--initial-advertise-peer-urls http://{{hostvars[inventory_hostname][internal_ip_name]}}:2380 \
--listen-peer-urls http://{{hostvars[inventory_hostname][internal_ip_name]}}:2380 \
--advertise-client-urls http://{{hostvars[inventory_hostname][internal_ip_name]}}:2379 \
--data-dir /var/etcd \
--initial-cluster-state new \
{% endif %}
{% if proxy_etcd %}
--proxy on \
{% endif %}
--name {{hostvars[inventory_hostname][machine_name]}} \
--listen-client-urls http://{{hostvars[inventory_hostname][internal_ip_name]}}:2379,http://127.0.0.1:2379 \
--initial-cluster {% for host in groups['etcd'] %}{{hostvars[host][machine_name]}}=http://{{hostvars[host][internal_ip_name]}}:2380{% if not loop.last %},{% endif %}{% endfor %}
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- takes version from etcd_version variable
- writes different lines for proxy mode or standard
- uses the internal ip variable configured
- walks the etcd hosts for the initial cluster
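Once the units are up, cluster health and membership can be verified from any managed host with the etcdctl binary installed by the docker role (a sketch):
etcdctl cluster-health
etcdctl member list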
This role sets values into etcd from the Ansible config when the etcd cluster has been recreated. It only needs to be executed from a single etcd machine.
roles/populate_etcd/tasks/main.yml
# Conditionally import populate.yml, so we don't have to see all the individual set tasks excluded in the output
- include: populate.yml
  static: no
  when: (etcd_template is defined and etcd_template is changed) or (force_populate_etcd is defined)
roles/populate_etcd/tasks/populate.yml
- name: /usr/bin/etcdctl set /domain/name <domain>
  command: /usr/bin/etcdctl set /domain/name {{domain_name}}
- name: /usr/bin/etcdctl set /domain/email <email>
  command: /usr/bin/etcdctl set /domain/email {{domain_email}}
- name: /usr/bin/etcdctl set /rethinkdb/pwd <Web Authorization Password>
  command: /usr/bin/etcdctl set /rethinkdb/pwd {{rethinkdb_web_password}}
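The seeded values can be read back from any host to confirm the populate ran (a sketch):
etcdctl get /domain/name
etcdctl get /domain/email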
The discovery publishing role is used to publish other services into etcd. It takes the service name, a parent path name, an optional group name, and a set of properties, and publishes the info into etcd for discovery by other services
roles/discovery/tasks/main.yml
# Template out the discovery publishing systemd unit
- name: "{{service}}_{{parent}}_{{port}}-publishing.service template"
  template:
    src: publishing.service
    dest: /etc/systemd/system/{{service}}_{{parent}}_{{port}}-publishing.service
  register: discovery_publishing_service_template
# Start/restart the discovery publisher when the template changed
- name: start/restart the discovery-publishing.service
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: "{{service}}_{{parent}}_{{port}}-publishing.service"
  when: discovery_publishing_service_template is changed
# Ensure the discovery publisher is started even if the template did not change
- name: ensure the discovery-publishing.service is started
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: "{{service}}_{{parent}}_{{port}}-publishing.service"
  when: not (discovery_publishing_service_template is changed)
Publishes the service's host and port into etcd at the expected path for the frontend to route to
roles/discovery/templates/publishing.service
[Unit]
Description={{service}} {{parent}} {{port}} Discovery Publishing
# Dependencies
Requires=etcd.service
Requires={{service}}.service
# Ordering
After=etcd.service
After={{service}}.service
# Restart when dependency restarts
PartOf=etcd.service
PartOf={{service}}.service
[Service]
ExecStart=/bin/sh -c "while true; do etcdctl set /{{parent}}/{{service}}/services/%H_{{port}}/host '{{hostvars[inventory_hostname][internal_ip_name]}}' --ttl 60; \
etcdctl set /{{parent}}/{{service}}/services/%H_{{port}}/port '{{port}}' --ttl 60; \
{% if service_local_properties is defined %}
{% for key in service_local_properties %}
etcdctl set /{{parent}}/{{service}}/services/%H_{{port}}/{{key}} '{{service_local_properties[key]}}' --ttl 60; \
{% endfor %}
{% endif %}
{% if service_properties is defined %}
{% for key in service_properties %}
etcdctl set /{{parent}}/{{service}}/{{key}} '{{service_properties[key]}}' --ttl 60; \
{% endfor %}
{% endif %}
sleep 45; \
done"
ExecStartPost=-/bin/sh -c '/usr/bin/etcdctl set /{{parent}}/watched "$(date +%s%N)"'
ExecStartPost=-/bin/sh -c '/usr/bin/etcdctl set /route_discovery/watched "$(date +%s%N)"'
ExecStop=/usr/bin/etcdctl rm /{{parent}}/{{service}}/services/%H_{{port}}
[Install]
WantedBy=multi-user.target
- requires etcd
- Publishes service's info into etcd every 45 seconds with a 60 second duration
- Deletes service info from etcd on stop
- Is restarted if etcd or the service restarts
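For example, once the main playbook registers the backend, the published tree can be inspected from any host; <hostname> stands for whatever %H expands to on the publishing machine (a sketch):
etcdctl ls --recursive /route_discovery/backend_nodejs
etcdctl get /route_discovery/backend_nodejs/services/<hostname>_8080/host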
The prometheus role templates out the Prometheus config and sets up the prometheus unit; route discovery is attached by the discovery role in the playbook above
roles/prometheus/tasks/main.yml
# Ensure the prometheus directories are created
- name: ensure prometheus directory is present
  file:
    state: directory
    path: /var/prometheus
- name: ensure prometheus config directory is present
  file:
    state: directory
    path: /var/prometheus/config
- name: ensure prometheus data directory is present
  file:
    state: directory
    path: /var/prometheus/data
# template out the prometheus config
- name: prometheus/config template
  template:
    src: prometheus.yml
    dest: /var/prometheus/config/prometheus.yml
  register: prometheus_config
# template out the systemd prometheus.service unit
- name: prometheus.service template
  template:
    src: prometheus.service
    dest: /etc/systemd/system/prometheus.service
  register: prometheus_service_template
- name: start/restart prometheus.service if template or config changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: prometheus.service
  when: (prometheus_service_template is changed) or (prometheus_config is changed)
- name: ensure prometheus.service is started, even if the template or config didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: prometheus.service
  when: not ((prometheus_service_template is changed) or (prometheus_config is changed))
roles/prometheus/templates/prometheus.yml
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    # Prometheus sets its app context in response to setting web.external-url
    metrics_path: /prometheus/metrics
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'etcd'
    static_configs:
      - targets: [{% for host in groups['all'] | difference(['localhost']) %}'{{hostvars[host][internal_ip_name]}}:2379'{% if not loop.last %},{% endif %}{% endfor %}]
  - job_name: 'nginx'
    static_configs:
      - targets: [{% for host in groups['frontend'] %}'{{hostvars[host][internal_ip_name]}}:9145'{% if not loop.last %},{% endif %}{% endfor %}]
  - job_name: 'node_exporter'
    static_configs:
      - targets: [{% for host in groups['all'] | difference(['localhost']) %}'{{hostvars[host][internal_ip_name]}}:9100'{% if not loop.last %},{% endif %}{% endfor %}]
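Once Prometheus is running, the rendered scrape targets can be confirmed through its HTTP API; note the /prometheus prefix introduced by web.external-url (a sketch, run from a prometheus host):
curl -s http://localhost:9090/prometheus/api/v1/targets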
roles/prometheus/templates/prometheus.service
[Unit]
Description=Prometheus
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-prometheus:{{prometheus_version}}
ExecStartPre=-/usr/bin/docker rm prometheus
ExecStart=/usr/bin/docker run --name prometheus -p 9090:9090 \
-v /var/prometheus:/var/prometheus \
chadautry/wac-prometheus:{{prometheus_version}} \
--web.external-url https://{{domain_name}}/prometheus
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- Starts a customized prometheus docker container
- Version comes from variables
- Takes config from local drive
- Saves data to local drive
The grafana role templates out the Grafana config and sets up the grafana unit; route discovery is attached by the discovery role in the playbook above
roles/grafana/tasks/main.yml
# Ensure the grafana directories are created
- name: ensure grafana directory is present
  file:
    state: directory
    path: /var/grafana
- name: ensure /var/grafana/config is present
  file:
    state: directory
    path: /var/grafana/config
- name: ensure /var/grafana/provisioning is present
  file:
    state: directory
    path: /var/grafana/provisioning
- name: ensure /var/grafana/provisioning/datasources is present
  file:
    state: directory
    path: /var/grafana/provisioning/datasources
- name: ensure /var/grafana/provisioning/dashboards is present
  file:
    state: directory
    path: /var/grafana/provisioning/dashboards
# template out the grafana config
- name: grafana config template
  template:
    src: config.ini
    dest: /var/grafana/config/config.ini
  register: grafana_config
# template out the prometheus datasource
- name: grafana datasource template
  template:
    src: datasource.yml
    dest: /var/grafana/provisioning/datasources/datasource.yml
  register: grafana_datasource
# template out the systemd grafana.service unit
- name: grafana.service template
  template:
    src: grafana.service
    dest: /etc/systemd/system/grafana.service
  register: grafana_service_template
- name: start/restart grafana.service if template, config, or datasource changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: grafana.service
  when: (grafana_service_template is changed) or (grafana_config is changed) or (grafana_datasource is changed)
- name: ensure grafana.service is started, even if the template or config didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: grafana.service
  when: not ((grafana_service_template is changed) or (grafana_config is changed) or (grafana_datasource is changed))
roles/grafana/templates/datasource.yml
apiVersion: 1
datasources:
  - access: 'proxy'
    editable: true
    isDefault: true
    name: 'prom1'
    orgId: 1
    type: 'prometheus'
    url: 'http://{{hostvars[inventory_hostname][internal_ip_name]}}:9090'
    version: 1
roles/grafana/templates/config.ini
[paths]
provisioning = /var/grafana/provisioning
[server]
domain = {{domain_name}}
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
[auth.proxy]
enabled = true
header_name = X-WEBAUTH-USER
header_property = username
auto_sign_up = true
roles/grafana/templates/grafana.service
[Unit]
Description=Grafana
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-grafana:{{grafana_version}}
ExecStartPre=-/usr/bin/docker rm grafana
ExecStart=/usr/bin/docker run --name grafana -p 3000:3000 \
-v /var/grafana:/var/grafana \
chadautry/wac-grafana:{{grafana_version}} -config /var/grafana/config/config.ini
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- Starts a customized grafana docker container
- Version comes from variables
- Takes config from local drive
- Saves data to local drive
Deploys or redeploys the prometheus/node_exporter instance on a host.
roles/prometheus-node-exporter/tasks/main.yml
# template out the systemd prometheus-node-exporter.service unit
- name: prometheus-node-exporter.service template
  template:
    src: prometheus-node-exporter.service
    dest: /etc/systemd/system/prometheus-node-exporter.service
  register: node_exporter_template
- name: start/restart the prometheus-node-exporter.service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: prometheus-node-exporter.service
  when: node_exporter_template is changed
- name: Ensure prometheus-node-exporter.service is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: prometheus-node-exporter.service
  when: not (node_exporter_template is changed)
roles/prometheus-node-exporter/templates/prometheus-node-exporter.service
[Unit]
Description=Prometheus node_exporter
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-prometheus-node_exporter:{{node_exporter_version}}
ExecStartPre=-/usr/bin/docker rm node_exporter
ExecStart=/usr/bin/docker run --name node_exporter -p 9100:9100 -v "/proc:/host/proc" \
-v "/sys:/host/sys" -v "/:/rootfs" --net="host" chadautry/wac-prometheus-node_exporter:{{node_exporter_version}}
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- takes version from node_exporter_version variable
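A quick check that the exporter is serving metrics on any host (a sketch):
curl -s http://localhost:9100/metrics | head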
The frontend role sets up the nginx unit, the nginx file watching & reloading units, and the letsencrypt renewal units, and finally pushes the frontend application across (tagged so it can be executed alone)
roles/frontend/tasks/main.yml
# Ensure the frontend directories are created
- name: ensure www directory is present
  file:
    state: directory
    path: /var/www
- name: ensure nginx directory is present
  file:
    state: directory
    path: /var/nginx
- name: ensure ssl directory is present
  file:
    state: directory
    path: /var/ssl
# Import backend route configurator (creates config before nginx starts)
- include: route-discovery-watcher.yml
# Import nginx task file
- include: nginx.yml
# Import ssl related tasks
- include: ssl.yml
# Import application push task
- include: application.yml
Nginx hosts static files, routes to instances (backends and databases), and terminates SSL according to its configuration
roles/frontend/tasks/nginx.yml
# template out the systemd nginx-config-templater.service unit
- name: nginx-config-templater.service template
  template:
    src: nginx-config-templater.service
    dest: /etc/systemd/system/nginx-config-templater.service
# template out the systemd nginx-reload.service unit
- name: nginx-reload.service template
  template:
    src: nginx-reload.service
    dest: /etc/systemd/system/nginx-reload.service
# template out the systemd nginx-reload.path unit
- name: nginx-reload.path template
  template:
    src: nginx-reload.path
    dest: /etc/systemd/system/nginx-reload.path
- name: Start nginx-reload.path
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: nginx-reload.path
# template out the systemd nginx.service unit
- name: nginx.service template
  template:
    src: nginx.service
    dest: /etc/systemd/system/nginx.service
  register: nginx_template
- name: start/restart nginx.service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: nginx.service
  when: nginx_template is changed
- name: Ensure nginx.service is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: nginx.service
  when: not (nginx_template is changed)
roles/frontend/templates/nginx.service
[Unit]
Description=NGINX
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-nginx:{{nginx_version}}
ExecStartPre=-/usr/bin/docker rm nginx
ExecStart=/usr/bin/docker run --name nginx -p 80:80 -p 443:443 -p 9145:9145 \
-v /var/www:/usr/share/nginx/html:ro -v /var/ssl:/etc/nginx/ssl:ro \
-v /var/nginx:/usr/var/nginx:ro \
chadautry/wac-nginx:{{nginx_version}}
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- Starts a customized nginx docker container
- Version comes from variables
- Takes server config from local drive
- Takes html from local drive
- Takes certs from local drive
A oneshot unit is used to template nginx's config. This prevents conflicts from multiple units triggering the action concurrently
roles/frontend/templates/nginx-config-templater.service
[Unit]
Description=NGINX config templater service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-nginx-config-templater:{{nginx_config_templater_version}}
ExecStartPre=-/usr/bin/docker rm nginx-templater
ExecStart=-/usr/bin/docker run --name nginx-templater --net host \
-v /var/nginx:/usr/var/nginx -v /var/ssl:/etc/nginx/ssl:ro \
chadautry/wac-nginx-config-templater:{{nginx_config_templater_version}}
Type=oneshot
- Removes the previous named docker container
- Executes the docker container to template out the nginx config
- Local volume mapped in for the templated config to be written to
- Local ssl volume mapped in for the template container to read
- Doesn't error out
- It is a one shot which expects to be called by other units (safely concurrent)
A pair of units are responsible for watching the nginx configuration and reloading the service
roles/frontend/templates/nginx-reload.service
[Unit]
Description=NGINX reload service
[Service]
ExecStart=-/usr/bin/docker kill -s HUP nginx
Type=oneshot
- Sends a signal to the named nginx container to reload
- Ignores errors
- It is a one shot which expects to be called by other units
roles/frontend/templates/nginx-reload.path
[Unit]
Description=NGINX reload path
[Path]
PathChanged=/var/nginx/nginx.conf
PathChanged=/var/ssl/fullchain.pem
[Install]
WantedBy=multi-user.target
- Watches config file
- Watches the (last copied) SSL cert file
- Automatically calls nginx-reload.service on change (because of matching unit name)
Sets a watch on the backend discovery location, and when it changes, templates out the nginx conf
TODO: Why isn't this a part of the nginx task?
roles/frontend/tasks/route-discovery-watcher.yml
# template out the systemd service unit
- name: route-discovery-watcher.service template
  template:
    src: route-discovery-watcher.service
    dest: /etc/systemd/system/route-discovery-watcher.service
  register: route_discovery_watcher_template
- name: start/restart the service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: route-discovery-watcher.service
  when: route_discovery_watcher_template is changed
- name: Ensure the service is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: route-discovery-watcher.service
  when: not (route_discovery_watcher_template is changed)
roles/frontend/templates/route-discovery-watcher.service
[Unit]
Description=Watches for nginx routes
# Dependencies
Requires=etcd.service
# Ordering
After=etcd.service
# Restart when dependency restarts
PartOf=etcd.service
[Service]
ExecStartPre=-/bin/sh -c '/usr/bin/etcdctl mk /route_discovery/watched "$(date +%s%N)"'
ExecStart=/usr/bin/etcdctl watch /route_discovery/watched
ExecStartPost=-/usr/bin/systemctl start nginx-config-templater.service
Restart=always
[Install]
WantedBy=multi-user.target
- Restarted if etcd restarts
- Starts a watch for changes in the backend discovery path
- Once the watch starts, executes the config templating systemd unit
- If the watch is ever satisfied, the unit will exit
- Automatically restarted, causing a new watch and templater execution
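Because the watch fires on any write to the watched key, a config re-template can also be forced by hand from any host with etcd access, exactly the way the publishing units do it (a sketch):
etcdctl set /route_discovery/watched "$(date +%s%N)"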
The SSL certificate is requested from letsencrypt
roles/frontend/tasks/ssl.yml
# template out the systemd certificate-sync.service unit
- name: certificate-sync.service template
  template:
    src: certificate-sync.service
    dest: /etc/systemd/system/certificate-sync.service
  register: certificate_sync_template
- name: start/restart the certificate-sync.service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: certificate-sync.service
  when: certificate_sync_template is changed
- name: Ensure certificate-sync.service is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: certificate-sync.service
  when: not (certificate_sync_template is changed)
# template out the systemd acme-response-watcher.service unit
- name: acme-response-watcher.service template
  template:
    src: acme-response-watcher.service
    dest: /etc/systemd/system/acme-response-watcher.service
  register: acme_response_watcher_template
- name: start/restart the acme-response-watcher.service if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: acme-response-watcher.service
  when: acme_response_watcher_template is changed
- name: Ensure acme-response-watcher.service is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: acme-response-watcher.service
  when: not (acme_response_watcher_template is changed)
# template out the systemd letsencrypt renewal units
- name: letsencrypt-renewal.service template
  template:
    src: letsencrypt-renewal.service
    dest: /etc/systemd/system/letsencrypt-renewal.service
- name: letsencrypt-renewal.timer template
  template:
    src: letsencrypt-renewal.timer
    dest: /etc/systemd/system/letsencrypt-renewal.timer
  register: letsencrypt_renewal_template
- name: start/restart the letsencrypt-renewal.timer if template changed
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: letsencrypt-renewal.timer
  when: letsencrypt_renewal_template is changed
- name: Ensure letsencrypt-renewal.timer is started, even if the template didn't change
  systemd:
    daemon_reload: yes
    enabled: yes
    state: started
    name: letsencrypt-renewal.timer
  when: not (letsencrypt_renewal_template is changed)
- name: Execute the renewal oneshot on deploy
  systemd:
    daemon_reload: yes
    enabled: no
    state: started
    name: letsencrypt-renewal.service
This unit takes the SSL certificates from etcd and writes them to the local system. It also attempts to set the local certificate back into etcd (for whenever etcd is reset due to a cluster change)
roles/frontend/templates/certificate-sync.service
[Unit]
Description=SSL Certificate Synchronization
# Dependencies
Requires=etcd.service
# Ordering
After=etcd.service
# Restart when dependency restarts
PartOf=etcd.service
[Service]
ExecStartPre=-/bin/sh -c '/usr/bin/etcdctl mk -- /ssl/server_chain "$(cat /var/ssl/chain.pem)"'
ExecStartPre=-/bin/sh -c '/usr/bin/etcdctl mk -- /ssl/key "$(cat /var/ssl/privkey.pem)"'
ExecStartPre=-/bin/sh -c '/usr/bin/etcdctl mk -- /ssl/server_pem "$(cat /var/ssl/fullchain.pem)"'
ExecStartPre=-/bin/sh -c '/usr/bin/etcdctl mk /ssl/watched "$(date +%s%N)"'
ExecStart=/usr/bin/etcdctl watch /ssl/watched
ExecStartPost=/bin/sh -c '/usr/bin/etcdctl get /ssl/server_chain > /var/ssl/chain.pem'
ExecStartPost=/bin/sh -c '/usr/bin/etcdctl get /ssl/key > /var/ssl/privkey.pem'
ExecStartPost=/bin/sh -c '/usr/bin/etcdctl get /ssl/server_pem > /var/ssl/fullchain.pem'
Restart=always
[Install]
WantedBy=multi-user.target
- Sets the local certs into etcd if they don't exist there
- The leading -- means there are no more command options; it is needed since the cert files start with '-'
- Creates /ssl/watched if it doesn't exist, so the unit has something to watch
- Starts a watch for SSL cert changes to copy
- Once the watch starts, copy the current certs
- If the watch is ever satisfied, the unit will exit
- Automatically restarted, causing a new watch and copy
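A certificate re-sync can likewise be forced by touching the watched key from any host (a sketch):
etcdctl set /ssl/watched "$(date +%s%N)"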
This unit watches etcd for the acme challenge response, and then calls the nginx-config-templater (the templater writes the response into the nginx config)
roles/frontend/templates/acme-response-watcher.service
[Unit]
Description=Watches for distributed acme challenge responses
# Dependencies
Requires=etcd.service
# Ordering
After=etcd.service
[Service]
ExecStart=/usr/bin/etcdctl watch /acme/watched
ExecStartPost=-/usr/bin/systemctl start nginx-config-templater.service
Restart=always
[Install]
WantedBy=multi-user.target
- Starts a watch for changes in the acme challenge response
- Once the watch starts, executes the config templating oneshot
- If the watch is ever satisfied, the unit will exit
- Automatically restarted, causing a new watch and templater execution
A pair of units are responsible for initiating the letsencrypt renewal process. The process executes daily but will not renew until there are fewer than 30 days remaining before the certificate expires
roles/frontend/templates/letsencrypt-renewal.service
[Unit]
Description=Letsencrypt renewal service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-acme:{{wac_acme_version}}
ExecStartPre=-/usr/bin/docker rm acme
ExecStart=-/usr/bin/docker run --net host -v /var/ssl:/var/ssl --name acme chadautry/wac-acme:{{wac_acme_version}}
Type=oneshot
- Calls the wac-acme container to create/renew the cert
- Container takes domain name and domain admin e-mail from etcd
- Container interacts with other units/containers to manage the renewal process through etcd
- Ignores errors
- It is a one shot which expects to be called by the timer unit
- Metadata will cause it to be made available on all frontend servers when loaded
- It technically could run anywhere with etcd, just limiting its loaded footprint
roles/frontend/templates/letsencrypt-renewal.timer
[Unit]
Description=Letsencrypt renewal timer
[Timer]
OnCalendar=*-*-* 05:00:00
RandomizedDelaySec=1800
[Install]
WantedBy=multi-user.target
- Executes daily at 5:00 (to avoid DST issues)
- Has a 30 minute randomized delay, so multiple copies don't all try to execute at once (though the docker image itself will exit if another is already running)
- Automagically executes the letsencrypt-renewal.service based on name
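To confirm the schedule, or to force an immediate renewal attempt outside the timer (a sketch):
systemctl list-timers letsencrypt-renewal.timer
systemctl start letsencrypt-renewal.service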
This task include takes the static front end application and pushes it across to instances
roles/frontend/tasks/application.yml
# Create archive of frontend content to transfer
- name: archive frontend on localhost
  local_action: archive
  args:
    path: "{{frontend_src_path}}"
    dest: "{{controller_src_staging}}/frontendsrc.tgz"
  become: false
  run_once: true
  tags: frontend_application
- name: Remove old webapp staging
  file:
    path: /var/staging/webapp
    state: absent
  tags: frontend_application
- name: Ensure remote staging dir exists
  file:
    path: /var/staging
    state: directory
  tags: frontend_application
- name: Copy over application
  copy:
    src: "{{controller_src_staging}}/frontendsrc.tgz"
    dest: /var/staging/frontendsrc.tgz
  tags: frontend_application
- name: Unpack webapp to staging
  unarchive:
    src: /var/staging/frontendsrc.tgz
    dest: /var/staging
    remote_src: true
  tags: frontend_application
- name: Pull alpine-rsync image
  command: /usr/bin/docker pull chadautry/alpine-rsync:{{rsync_version}}
  tags: frontend_application
- name: sync staging and /var/www
  command: /usr/bin/docker run -v /var/staging:/var/staging -v /var/www:/var/www --rm chadautry/alpine-rsync:{{rsync_version}} -a /var/staging/webapp/ /var/www
  tags: frontend_application
This role sets up a nodejs unit and the discovery unit, and finally pushes the application source across (tagged so it can be executed alone). It is configurable for hosting multiple nodejs processes with multiple discovered routes
roles/nodejs/tasks/main.yml
# Ensure the backend directories are created
- name: ensure application directory is present
  file:
    state: directory
    path: /var/nodejs/{{identifier}}
# Deploy the process's application source
- include: application.yml
# Template out the nodejs config
- name: config.js template
  template:
    src: config.js
    dest: /var/nodejs/{{identifier}}/config.js
# Template out the nodejs systemd unit
- name: nodejs.service template
  template:
    src: nodejs.service
    dest: /etc/systemd/system/{{identifier}}_nodejs.service
# Always restart the nodejs server
- name: start/restart the nodejs.service
  systemd:
    daemon_reload: yes
    enabled: yes
    state: restarted
    name: "{{identifier}}_nodejs.service"
This task include takes the static application source and pushes it across to instances
roles/nodejs/tasks/application.yml
# Create archive of application files to transfer
- name: archive application on localhost
  local_action: archive
  args:
    path: "{{node_src_path[identifier]}}/*"
    dest: "{{controller_src_staging}}/{{identifier}}src.tgz"
  become: false
  run_once: true
- name: Remove old nodejs staging
  file:
    path: /var/staging/{{identifier}}
    state: absent
- name: Ensure nodejs staging dir exists
  file:
    path: /var/staging/{{identifier}}
    state: directory
- name: Transfer nodejs application archive
  copy:
    src: "{{controller_src_staging}}/{{identifier}}src.tgz"
    dest: /var/staging
# Using the unarchive module caused errors, presumably due to the large number of files in node_modules
- name: Unpack nodejs application archive
  command: /bin/tar --extract -C /var/staging/{{identifier}} -z -f /var/staging/{{identifier}}src.tgz
  args:
    warn: no
- name: Pull alpine-rsync image
  command: /usr/bin/docker pull chadautry/alpine-rsync:{{rsync_version}}
- name: sync staging and /var/nodejs
  command: /usr/bin/docker run -v /var/staging/{{identifier}}:/var/staging/{{identifier}} -v /var/nodejs/{{identifier}}:/var/nodejs/{{identifier}} --rm chadautry/alpine-rsync:{{rsync_version}} -a /var/staging/{{identifier}}/ /var/nodejs/{{identifier}}
The template for the nodejs server's config
roles/nodejs/templates/config.js
module.exports = {
    PORT: 80,
    {% for key in node_config[identifier] %}{{key|upper}}: '{{node_config[identifier][key]}}'{% if not loop.last %},{% endif %}{% endfor %}
};
The main application unit is simply a docker container with Node.js installed and the code to be executed mounted inside
roles/nodejs/templates/nodejs.service
[Unit]
Description=NodeJS Backend API
# Dependencies
Requires=docker.service
# Ordering
After=docker.service
[Service]
ExecStartPre=-/usr/bin/docker pull chadautry/wac-node
ExecStartPre=-/usr/bin/docker rm -f {{identifier}}-node-container
ExecStart=/usr/bin/docker run --name {{identifier}}-node-container -p {{nodejs_port}}:80 \
-v /var/nodejs/{{identifier}}:/app:ro \
chadautry/wac-node '%H'
ExecStop=-/usr/bin/docker stop {{identifier}}-node-container
Restart=always
[Install]
WantedBy=multi-user.target
- requires docker
- Starts a customized nodejs docker container
- Takes the app and configuration from local drive
This role templates out each function's config and deploys the Google Cloud Functions
roles/gcp_functions/tasks/main.yml
# Template out the function's config; config.js should be in the function's .gitignore
- name: config.js template
  template:
    src: config.js
    dest: "{{item.src_path}}/config.js"
  loop: "{{ gcp_functions }}"
  when: "single_function is not defined or single_function == item.name"
# Deploy each function from its source directory
- name: Deploy function
  command: gcloud functions deploy {{item.0.name}} --runtime nodejs8 --region={{item.1}} --trigger-http
  args:
    chdir: "{{item.0.src_path}}"
  loop: "{{ gcp_functions|subelements('regions') }}"
  when: "single_function is not defined or single_function == item.0.name"
The template for the cloud function's config
roles/gcp_functions/templates/config.js
module.exports = {
    {% for key in item.props %}{{key|upper}}: '{{item.props[key]}}'{% if not loop.last %},{% endif %}{% endfor %}
};
The static discovery publishing role is used to publish cloud function routes into etcd. A cloud function URL looks like https://YOUR_REGION-YOUR_PROJECT_ID.cloudfunctions.net/FUNCTION_NAME
roles/gcp_functions_publishing/tasks/main.yml
- name: Push the function domain for the function route
  command: "/usr/bin/etcdctl set /route_discovery/{{item.0.route}}/services/{{item.1}}{{item.0.route}}/host '{{item.1}}-{{google_project_id}}.cloudfunctions.net'"
  loop: "{{ gcp_functions|subelements('regions') }}"
  when: "single_function is not defined or single_function == item.0.name"
- name: Push the function port for the function route
  command: "/usr/bin/etcdctl set /route_discovery/{{item.0.route}}/services/{{item.1}}{{item.0.route}}/port '443'"
  loop: "{{ gcp_functions|subelements('regions') }}"
  when: "single_function is not defined or single_function == item.0.name"
- name: Push single upstream host as the host header # TODO really allow multi region
  command: "/usr/bin/etcdctl set /route_discovery/{{item.0.route}}/proxyHostHeader '{{item.1}}-{{google_project_id}}.cloudfunctions.net'"
  loop: "{{ gcp_functions|subelements('regions') }}"
  when: "single_function is not defined or single_function == item.0.name"
- name: Push private=false for the function route
  command: "/usr/bin/etcdctl set /route_discovery/{{item.route}}/private 'false'"
  loop: "{{ gcp_functions }}"
  when: "single_function is not defined or single_function == item.name"
- name: Push https protocol
  command: "/usr/bin/etcdctl set /route_discovery/{{item.route}}/protocol 'https'"
  loop: "{{ gcp_functions }}"
  when: "single_function is not defined or single_function == item.name"
- name: Push upstreamRoute
  command: "/usr/bin/etcdctl set /route_discovery/{{item.route}}/upstreamRoute '/{{item.name}}'"
  loop: "{{ gcp_functions }}"
  when: "single_function is not defined or single_function == item.name"
- name: Push timestamp to watched entry so nginx config is refreshed
  shell: "/usr/bin/etcdctl set /route_discovery/watched \"$(date +%s%N)\""