emanuil-tolev / python-logging

Centralized Application Logs with the Elastic Stack

This repository gives an overview of five different logging patterns:

  • Parse: Take the log files of your applications and extract the relevant pieces of information.
  • Send: Add a log appender to send out your events directly without persisting them to a log file.
  • Structure: Write your events in a structured file, which you can then centralize.
  • Containerize: Keep track of short lived containers and configure their logging correctly.
  • Orchestrate: Stay on top of your logs even when services are short lived and dynamically allocated on Kubernetes.

The slides for this talk are available on Speaker Deck.

Dependencies

  • Python 2 or 3 to run the Python code (but you don't need this if using the containerized app).
  • Docker (and Docker Compose) to run all the required components of the Elastic Stack (Filebeat, Logstash, Elasticsearch, and Kibana) and the containerized Python application.

Usage

  • Bring up the Elastic Stack: $ docker-compose up --build
  • Rerun the Python logging example application if necessary: $ docker restart <ID of the python app>
  • Remove the Elastic Stack (and its volumes): $ docker-compose down -v

Demo

  1. Take a look at the code — which pattern are we building with log statements here?
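
The demo code itself lives in the repository; as a minimal sketch (the logger name, messages, and session identifier below are illustrative, not taken from the repo), the log statements might look something like this:

```python
import logging
import uuid

# Plain text logging to a file: the application only writes lines of text,
# and the structure is extracted later by the Elastic Stack (the "Parse" pattern).
logging.basicConfig(filename="app.log", level=logging.DEBUG)
logger = logging.getLogger("demo")

session = uuid.uuid4()  # illustrative session identifier

logger.info("Starting up session %s", session)
logger.warning("Something looks odd in session %s", session)
try:
    raise ValueError("An error with an emoji \U0001F4A5")
except ValueError:
    logger.exception("Request failed in session %s", session)
```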

Parse

  1. Copy a log line and start parsing it with the Grok Debugger in Kibana, for example with the pattern ^\[%{TIMESTAMP_ISO8601:timestamp}\]%{SPACE}%{LOGLEVEL:level} — show https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns to get started. The rest will be done with the logstash.conf (a sketch of a matching log format follows this list).
  2. Point to https://github.com/elastic/ecs for the naming conventions.
  3. Show the Data Visualizer in Machine Learning by uploading the log file. Its output is actually quite good already, but we are sticking with our manual rules for now.
  4. Find the log statements in Kibana's Discover view for the parse index.
  5. Show the pipeline in Kibana's Monitoring view as well as the other components in Monitoring.
  6. How many log events should we have? 40. But we have 42 entries instead. Even though 42 would generally be the perfect number, here it's not.
  7. See the _grokparsefailure in the tags field. Enable the multiline rules in Filebeat; it should pick up the change automatically, and when you run the application again it should only collect 40 events.
  8. Show that this is working as expected now and drill down to the errors to see which emoji we are logging.
  9. Create a vertical bar chart visualization on the level field, then break it down further by session.
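
To make steps 1 and 7 concrete, here is a hedged sketch (the exact format string and file name are assumptions, not the repository's code) of a formatter whose output the grok pattern above can parse, and of a logger.exception() call whose traceback spans several lines and therefore needs the Filebeat multiline rules:

```python
import logging

# Produces lines like "[2019-06-01 12:34:56,789] ERROR    message"; the leading
# "[timestamp] LEVEL" part is what the grok pattern from step 1 matches.
formatter = logging.Formatter("[%(asctime)s] %(levelname)-8s %(message)s")

handler = logging.FileHandler("app.log")
handler.setFormatter(formatter)

logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

logger.info("A single-line event")
try:
    1 / 0
except ZeroDivisionError:
    # The traceback spans several lines but is one logical event,
    # which is why Filebeat needs the multiline configuration from step 7.
    logger.exception("A multi-line event")
```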

Send

  1. Show that the logs are missing from the first run, since no connection to Logstash had been established yet (a sketch of such a direct-to-Logstash handler follows this list).
  2. Rerun the application and see that it works now; this also demonstrates the main downside of this approach.
  3. Finally, you would need to rename the fields to match ECS in a Logstash filter.
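
As a hedged sketch of the pattern (the python-logstash package, host, and port below are assumptions; the repository may use a different client library), a handler that ships each event straight to Logstash could look like this:

```python
import logging

import logstash  # pip install python-logstash (assumed; the repo may use another client)

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)

# Ship every event directly to Logstash over TCP instead of writing a log file.
# If Logstash is not reachable yet, as on the first run, these events are simply
# lost: the main downside mentioned in step 2.
logger.addHandler(logstash.TCPLogstashHandler("localhost", 5959, version=1))

logger.info("Sent straight to Logstash", extra={"session": "abc123"})
```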

Structure

  1. Run the application and show the data in the structure index.
  2. Show the JSON logging configuration, since it is a little more complicated than the others (a sketch of the idea follows this list).
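
The configuration in the repository is authoritative; as an illustration of the idea (the python-json-logger package and the field list below are assumptions), each record can be rendered as one JSON document so that no grok parsing is needed downstream:

```python
import logging

from pythonjsonlogger import jsonlogger  # pip install python-json-logger (assumed)

# Every record becomes one JSON object in the file, which Filebeat can ship
# and Elasticsearch can index without any grok parsing.
handler = logging.FileHandler("app.json.log")
handler.setFormatter(jsonlogger.JsonFormatter("%(asctime)s %(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Structured event", extra={"session": "abc123"})
# -> {"asctime": "...", "levelname": "INFO", "name": "demo", "message": "Structured event", "session": "abc123"}
```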

Containerize

  1. Show the metadata we are collecting now.
  2. See why the console output works here, but turn off the colorization, since it would otherwise break the parsing (see the sketch after this list).
  3. Turn on the ingest pipeline, restart Docker Compose, and show that everything is working.
  4. See why the grok failure rule is needed: it catches the startup error from sending to Logstash directly.
  5. Filter to the right container name and point out the hinting that stops the multiline statements from being broken up.
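
As a small illustrative sketch (not the repository's actual configuration), logging to stdout with a plain formatter keeps the container output parseable: Docker captures stdout/stderr, Filebeat adds the container metadata, and there are no ANSI color codes to trip up the parsing from step 2:

```python
import logging
import sys

# Log to stdout so the Docker logging driver captures the events and Filebeat
# can pick them up together with the container metadata.
handler = logging.StreamHandler(sys.stdout)
# Keep the formatter plain: ANSI color escape sequences would end up in the
# message text and break the parsing mentioned in step 2.
handler.setFormatter(logging.Formatter("[%(asctime)s] %(levelname)-8s %(message)s"))

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Hello from inside the container")
```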

About

License: Apache License 2.0

