OBSERVABILITY WORKSHOP
OVERVIEW
The files found in this repo comprise all the required artifacts to run a sample app and demonstrate the a few metrics' scraping scenarios. The sample app mfence
is configured using the original fence
app as inspiration, i.e., a single container running both Nginx (web server) and uWSGI (app server) to host a Python application that operates with Flask Blueprints.
All these 3 logical layers (nginx, uwsgi & python app) carry configuration changes and the addition of new HTTP endpoints to expose metrics, such as:
- Nginx: Number of active connections and HTTP I/O (Reading, Writing and Waiting) =
http://localhost:6565/nginx_status
. - uWSGI: Workers' activity, status, data transmission, errors =
http://localhost:6565/uwsgi_status
- mfence's
/metrics
endpoint: This blueprint script produces fictitious metrics. The fake data illustrates what sort of application-specific metrics could be exposed to give service-owners more visibility on interesting events =http://localhost:6565/metrics/
.
The following diagram illustrates how the metrics are obtained:
Other details to talk about
- Poetry
- Flask / Blueprints
- Dockerfile (and dynamic config changes in
dockerrun.sh
)
Application metrics
Examine the src/mfence/blueprints/metrics.py
script.
Running local test
brew install nginx
brew install uwsgi
# copy over the nginx.conf to /usr/local/etc/nginx/nginx.conf and set a diff port (e.g., 6567)
brew services restart nginx
uwsgi --socket 127.0.0.1:3031 --ini uwsgi-local-run.ini
# now just try: http://localhost:6567/
CHECK SOME STATS
pip install uwsgitop
uwsgitop http://127.0.0.1:9191
BUILDING THE DOCKER IMAGE
docker build --tag mfence:try1 .
START YOUR APP
docker run -it --rm -v $(pwd)/logs/:/var/log/nginx/ -p 6565:80 --name mfence mfence:try1
START THE NGINX METRICS EXPORTER
docker run -p 9113:9113 --rm --link mfence:mfence --name nginx-prometheus-exporter nginx/nginx-prometheus-exporter:0.8.0 -nginx.scrape-uri http://mfence/nginx_status
START THE UWSGI METRICS EXPORTER
docker run --rm -it -p 9117:9117 --link mfence:mfence --name uwsgi-exporter timonwong/uwsgi-exporter --stats.uri http://mfence/uwsgi_status
START THE NGINX LOGS METRICS EXPORTER
docker run --rm --name nginx-logs-exporter -v $(pwd)/logs:/mnt/nginxlogs/ -p 4040:4040 quay.io/martinhelmich/prometheus-nginxlog-exporter mnt/nginxlogs/access_not_json.log
START PROMETHEUS CONTAINER
debug mode enabled in case you have any metrics scraping issues
docker run --rm -it -p 9090:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml --link mfence:mfence --link nginx-prometheus-exporter:nginx-prometheus-exporter --link uwsgi-exporter:uwsgi-exporter --link nginx-logs-exporter:nginx-logs-exporter --name myprometheus --entrypoint sh prom/prometheus -c "/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus --web.console.libraries=/usr/share/prometheus/console_libraries --web.console.templates=/usr/share/prometheus/consoles --log.level=debug"
TESTING
Check the following URLs:
# your app metrics
http://localhost:6565/metrics/
# your nginx stats info (not in prometheus-scraping format)
http://localhost:6565/nginx_status
# your uwsgi stats info (not in prometheus-scraping format)
http://localhost:6565/uwsgi_status
# your nginx metrics
http://localhost:9113/metrics
# your uwsgi metrics
http://localhost:9117/metrics
# your nginx logs metrics
http://localhost:4040/metrics
# your Prometheus console
http://localhost:9090/
START GRAFANA CONTAINER
docker run -it --rm --link prometheus:prometheus --name grafana -p 3001:3000 grafana/grafana:7.1.1
STOP ALL CONTAINERS AND TRY DOCKER-COMPOSE
# start all
docker-compose up -d
# stop all
docker-compose down
DOCKER CLEANUP COMMANDS
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images | grep "<none>" | awk '{ print $3}')
Some queries
Just a few prometheus queries to play around with the metrics:
avg(cirrus_retries)
rate(uwsgi_worker_transmitted_bytes_total[5m])
rate(nginx_http_response_count_total{status="200", method="GET"}[5m])
# TODO: Add more interesting queries
# e.g., "success rate ((num_failures / total_reqs)*100 )
Screenshots
How it looks like once you put it all together: TODO: Add screenshots