log-monitor
An HTTP log monitor that runs as a console application. It uses queues and the producer-consumer pattern to process log lines, keep statistics, and handle alerting.
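The producer-consumer arrangement described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in app.py: the names `producer`, `consumer`, `run`, and `SENTINEL` are hypothetical, and the per-status counter stands in for whatever statistics the real program keeps.

```python
import queue
import threading
from collections import Counter

SENTINEL = None  # hypothetical marker that tells the consumer to stop


def producer(lines, q):
    """Put each raw log line on the queue, then signal completion."""
    for line in lines:
        q.put(line)
    q.put(SENTINEL)


def consumer(q, stats):
    """Take lines off the queue and count hits per HTTP status code."""
    while True:
        line = q.get()
        if line is SENTINEL:
            break
        # In the common log format the status code is the second-to-last field.
        status = line.split()[-2]
        stats[status] += 1


def run(lines):
    """Wire one producer to one consumer thread and return the stats."""
    q = queue.Queue()
    stats = Counter()
    worker = threading.Thread(target=consumer, args=(q, stats))
    worker.start()
    producer(lines, q)
    worker.join()
    return stats
```

The queue decouples reading from processing, so a slow consumer (for example, one evaluating alert rules) does not block the code that reads the file.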
Dependencies
- pipenv - to manage virtualenvs and dependencies
- pygtail - reads log file lines that have not been read and stores offset.
- python-dateutil - parsing timestamps.
- watchdog - used for system agnostic file watching.
- blessed - for a nice terminal UI.
Installation
Requires pipenv (pipenv.org). After installing it, run pipenv install
(or pipenv install --dev if you want to run the tests). This should install all dependencies (including Python 3.6).
Running
pipenv shell # activate virtualenv
rm offset # remove the offset file if it exists
LOG_FILE=apache.log LOG_FILE_PATH=. python app.py # without LOG_FILE and LOG_FILE_PATH, /var/log/access.log is used
# see other variables in app.py for customization.
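For illustration, the environment-variable fallback described above could look something like the sketch below. It assumes the two variables are simply joined into a path; the function name `resolve_log_file` and the constant names are hypothetical, and app.py may resolve them differently.

```python
import os

# Defaults taken from the README: /var/log/access.log
DEFAULT_LOG_PATH = "/var/log"
DEFAULT_LOG_FILE = "access.log"


def resolve_log_file(env=os.environ):
    """Build the log file path from LOG_FILE_PATH and LOG_FILE, with defaults."""
    path = env.get("LOG_FILE_PATH", DEFAULT_LOG_PATH)
    name = env.get("LOG_FILE", DEFAULT_LOG_FILE)
    return os.path.join(path, name)
```

Passing the environment as an argument keeps the function easy to test without touching the real process environment.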
Testing
To run the unit tests, use py.test
For manual testing:
echo '64.242.88.10 - - [07/Mar/2018:16:56:50 -0800] "GET /twiki/bin HTTP/1.1" 200 8545' >> apache.log
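As a rough guide to what the monitor has to extract from a line like the one above, here is a sketch of a parser for that format. It is an assumption-laden example rather than the project's parser: the regular expression, the field names, and `parse_line` are all hypothetical, and it uses the standard library's `datetime.strptime` instead of python-dateutil to stay self-contained.

```python
import re
from datetime import datetime

# Hypothetical pattern for the common log format used in the sample line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of parsed fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamp looks like 07/Mar/2018:16:56:50 -0800
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" size means no body was sent; treat it as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry
```

Returning None for malformed lines lets the caller skip them instead of crashing the monitor on unexpected input.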
Future improvements
- create a library from pygtail and watchdog (remove extra complexity)
- the program should run as a daemon in production
- configuration file
- better dashboard
- Gevent and multiprocessing (lower memory footprint and possibly better scaling vs. process isolation). Important if we want to support much more complex alerting logic and a huge volume of data.
- asyncio - worth exploring, especially if we need to handle multiple I/O sources rather than a single file.
- timezone changes
- rate-limiting/throttling
- integration tests and more unit tests
- store stats and alerts in persistent storage and send them to a remote server, e.g. with an HTTP call