Revere

Disclaimer

Revere runs Python entered via a webpage. It currently makes no attempt to sandbox this code. Always run Revere as a non-privileged user and ensure you have authentication set up.

This project was inspired by LivingSocial's Rearview. We have been using it without issue for several months, but it is far from stable.

There is optional Google Apps OAuth authentication built in, but due to the nature of this application, it is still highly recommended that you secure Revere behind your firewall.

Terms

source - a source of data: a database, Graphite server, or third-party monitoring API
alert - a way of notifying you when certain criteria are met: Campfire, AWS SNS, email, text, etc.
monitor - a script that runs on a schedule and pulls data from one or more sources. A monitor can indicate it is in the ALARM state, in which case the alerts will be fired.

Features

Pluggable sources of data
Pluggable alerts to notify you
Write your monitors using Python and specify the schedule using crontab syntax
Store the return value of the monitor (numbers or strings) for each run
Automatic purging of old data (day granularity)

Revere is a general-purpose monitoring and alerting system. Its sources of data and its alerts are both pluggable, so you can pull data from anywhere you want and trigger alarms when certain thresholds are crossed, all while using pure Python for your calculations.

Installation

pip install git+git://github.com/pegler/revere.git

# create a config file. defaults will be used for anything missing from the file
touch config.py

# create the SQLite database
revereserver.py init

# run Revere. defaults to port 5000
revereserver.py run

Configuration

Revere uses a Python file named config.py in the current working directory. The configuration variables are:

DATABASE_PATH - the path to the SQLite file
REVERE_SOURCES - a dict specifying the sources. The key can be anything and is used by the monitors to access the source. The value is a configuration dict for the source.
REVERE_ALERTS - a dict specifying the alerts
GOOGLE_APPS_DOMAIN - if specified, Google Apps OAuth Authentication will be enabled and enforced on all views. The domain specified will be the only domain permitted access. This requires specifying a SECRET_KEY in your config file as well.
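Because enabling GOOGLE_APPS_DOMAIN requires a SECRET_KEY, it helps to have a quick way to produce a random one. The snippet below is a generic sketch using only the standard library. It is not something Revere ships, and the helper name generate_secret_key is made up for illustration.

```python
import binascii
import os


def generate_secret_key(num_bytes=32):
    """Return a random hex string that can be pasted into config.py."""
    # os.urandom draws from the operating system's secure random source
    return binascii.hexlify(os.urandom(num_bytes)).decode('ascii')


# Print a line that can be copied into config.py
print("SECRET_KEY = '%s'" % generate_secret_key())
```

Generate the key once and keep it in config.py. A key that changes on every start would likely invalidate existing login sessions.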

Example config.py file:

DATABASE_PATH = 'revere.db'

SECRET_KEY = 'something random and secret'

#GOOGLE_APPS_DOMAIN = 'example.com' # optional

REVERE_SOURCES = {
    'graphite': {
        'description': 'Graphite Server',
        'type': 'revere.sources.graphite.GraphiteSource',
        'config': {
            'url': 'http://dashing.example.com/render',
            'auth_username': 'username',
            'auth_password': 'password',
        }
    },
    'mysql': {
        'description': 'Local MySQL Database',
        'type': 'revere.sources.database.DatabaseSource',
        'config': {
            'connection_string': 'mysql://readonlyuser:password@localhost/production',
        }
    }
}

REVERE_ALERTS = {
    'campfire-engineering': {
        'description': 'Post a message to Campfire - Engineering',
        'type': 'revere.alerts.campfire.CampfireAlert',
        'config': {
            'api_token': 'xxxxxx',
            'subdomain': 'example',
            'room_id': '123456',
        }
    },
    'operations-sns': {
        'description': 'Publish a message to AWS SNS Topic operations',
        'type': 'revere.alerts.sns.SNSAlert',
        'config': {
            'region': 'us-east-1',
            'topic_arn': 'xxxxx',
            'access_key_id': 'xxxxx',
            'secret_key': 'xxxxx',
        }
    }
}
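Each entry's 'type' is a dotted path to a class, which is what makes sources and alerts pluggable. The following is a minimal sketch of how such a path could be resolved and instantiated. It is not Revere's actual loader: the function names load_class and build_plugins are hypothetical, and passing the 'config' dict as a single positional argument is an assumption.

```python
import importlib


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object it names."""
    module_path, _, class_name = dotted_path.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def build_plugins(plugin_settings):
    """Instantiate every entry of a REVERE_SOURCES / REVERE_ALERTS style dict.

    Assumption: each plugin class accepts its 'config' dict as one argument.
    """
    plugins = {}
    for key, entry in plugin_settings.items():
        plugin_class = load_class(entry['type'])
        plugins[key] = plugin_class(entry.get('config', {}))
    return plugins
```

With a loader along these lines, adding a custom source would only require an importable class and a new entry in REVERE_SOURCES.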

Monitors

Monitors are written in plain Python. Navigate to the "Create Monitor" page, specify the schedule using crontab syntax, specify the retention period, and then write the Python that does the checking. The script is executed with a dictionary named sources in scope, which holds the configured sources. The keys are the same as those specified in the configuration file.

If the monitor has "failed" and should be in the ALARM state, the code should raise a MonitorFailure exception. The message passed into the exception will be included in any alerts triggered from the ALARM state.

Any other exception raised will change the monitor to the ERROR state and trigger any enabled alerts.

Any data assigned to the variable return_value will be recorded. The data must be an int, float, long, string, or unicode.

An example monitor:

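# 'graphite' is the key used for this source in the example config.py
total_requests = sources['graphite'].get_sum('sum(stats_counts.response.*)', '-10min')
error_requests = sources['graphite'].get_sum('stats_counts.response.500', '-10min')
# avoid integer division and a ZeroDivisionError when there was no traffic
error_percentage = float(error_requests) / total_requests if total_requests else 0.0
return_value = error_percentage

# error_percentage is a ratio, so .005 means 0.5% of all requests
if error_percentage >= .005:
    raise MonitorFailure('High number of error responses. %s%%' % (error_percentage * 100))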

Alerts will often include the return value, message passed into MonitorFailure, and the current state of the monitor.
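The state rules described in this section can be sketched as a small runner. This is only an illustration of the semantics, not Revere's implementation: run_monitor is a hypothetical name, the MonitorFailure class here is a stand-in, and the 'Monitor Passed' message is borrowed from the sample Campfire alert.

```python
class MonitorFailure(Exception):
    """Stand-in for the exception a monitor raises to signal ALARM."""


def run_monitor(code, sources):
    """Execute monitor code and return (state, message, return_value)."""
    scope = {
        'sources': sources,
        'MonitorFailure': MonitorFailure,
        'return_value': None,
    }
    try:
        exec(code, scope)
    except MonitorFailure as failure:
        # The monitor itself decided it has failed
        return 'ALARM', str(failure), scope.get('return_value')
    except Exception as error:
        # Any other exception means the monitor could not be evaluated
        return 'ERROR', repr(error), scope.get('return_value')
    return 'OK', 'Monitor Passed', scope.get('return_value')
```

Note that return_value is read from the scope even when an exception is raised, so a value assigned before the failure is still available for recording.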

Sources

revere.sources.graphite.GraphiteSource

Pull data from a Graphite server.

Configuration

Parameters:

url - the url of the graphite server
auth_username (optional) - username for basic authentication
auth_password (optional) - password for basic authentication

Usage

It has three methods, all taking the same parameters:

path - the dotted path for the data to retrieve. Graphite functions can be passed in.
from_date - any valid Graphite starting time. Example: '-5d'
to_date - any valid Graphite ending time. Example: '-2d'

Methods:

get_datapoints(path, from_date=None, to_date=None) - return a list of (value, timestamp) pairs for the path within the given timeframe
get_sum(path, from_date=None, to_date=None) - return the sum of the values. null values are counted as 0
get_avg(path, from_date=None, to_date=None) - return the average of the values. null values are counted as 0
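To make the null handling concrete, here is a rough sketch of the arithmetic over a list of (value, timestamp) pairs. The helper names are hypothetical, and two details are assumptions rather than documented behavior: nulls count toward the divisor of the average, and an empty list averages to 0.

```python
def sum_datapoints(datapoints):
    """Sum the values of (value, timestamp) pairs, counting None as 0."""
    return sum(value or 0 for value, _timestamp in datapoints)


def avg_datapoints(datapoints):
    """Average the values, counting None as 0.

    Assumption: null points still count toward the divisor.
    """
    if not datapoints:
        return 0
    return sum_datapoints(datapoints) / float(len(datapoints))


# Three one-minute buckets, the middle one with no data
datapoints = [(4.0, 1700000000), (None, 1700000060), (2.0, 1700000120)]
total = sum_datapoints(datapoints)    # 6.0
average = avg_datapoints(datapoints)  # 2.0
```

If gaps in the data should not drag an average down, call get_datapoints and filter out the null values in the monitor before averaging.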

revere.sources.database.DatabaseSource

Connect to any database. It uses SQLAlchemy for connections, which supports most databases.

Configuration

Parameters:

connection_string - the SQLAlchemy connection string to the database. See: http://pythonhosted.org/Flask-SQLAlchemy/config.html#connection-uri-format
pool_recycle (default: 3600) - number of seconds before a connection in the pool should be recycled

Usage

The only method is execute(sql, as_dict=False), which accepts raw SQL and returns a list of tuples by default. If as_dict is True, it returns a list of dicts keyed on the column names.
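The difference between the two return shapes is easiest to see side by side. The sketch below imitates the behavior with the standard library's sqlite3 module so that it runs anywhere. It is a stand-in, not the SQLAlchemy-based DatabaseSource, and the mail_queue table is invented for the example.

```python
import sqlite3


def execute(connection, sql, as_dict=False):
    """Run raw SQL and return a list of tuples, or of dicts if as_dict is True."""
    cursor = connection.execute(sql)
    rows = cursor.fetchall()
    if not as_dict:
        return rows
    # cursor.description holds one entry per column; the name comes first
    columns = [column[0] for column in cursor.description or []]
    return [dict(zip(columns, row)) for row in rows]


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE mail_queue (id INTEGER, status TEXT)')
connection.executemany(
    'INSERT INTO mail_queue VALUES (?, ?)',
    [(1, 'queued'), (2, 'sent'), (3, 'queued')],
)

query = ('SELECT status, COUNT(*) AS total FROM mail_queue '
         'GROUP BY status ORDER BY status')
as_tuples = execute(connection, query)               # [('queued', 2), ('sent', 1)]
as_dicts = execute(connection, query, as_dict=True)  # [{'status': 'queued', 'total': 2}, ...]
```

In a monitor, the dict form tends to read better when a query returns several columns, since values are accessed by name rather than by position.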

Alerts

Alerts can be configured to fire only when a monitor transitions to a particular state. For example, you can get a phone call when a monitor enters the ALARM state, but only an email when it goes back to the OK state.
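One way to picture this filtering is a function that takes the old state, the new state, and the states each alert is enabled for. This is a sketch under assumptions: alerts_to_fire and the alert_states mapping are hypothetical, and the real storage format for these settings is not described here.

```python
def alerts_to_fire(old_state, new_state, alert_states):
    """Return the keys of the alerts that should fire for this transition.

    alert_states maps an alert key to the set of states it is enabled for.
    """
    if new_state == old_state:
        # No transition happened, so nothing fires
        return []
    return sorted(
        key for key, states in alert_states.items() if new_state in states
    )


alert_states = {
    'operations-sns': {'ALARM', 'ERROR'},
    'campfire-engineering': {'ALARM', 'ERROR', 'OK'},
}
on_alarm = alerts_to_fire('OK', 'ALARM', alert_states)
on_recovery = alerts_to_fire('ALARM', 'OK', alert_states)
```

Here both alerts fire when the monitor enters ALARM, while only the Campfire alert fires on recovery.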

revere.alerts.campfire.CampfireAlert

Send a message to a Campfire room of the form:

[Revere Alarm]
Monitor: Mail Queue Length
State Change: ALARM -> OK
Message: Monitor Passed
Return Value: 67
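A message in this layout could be assembled as follows. The function name and its parameters are hypothetical and only mirror the fields visible in the sample above.

```python
def format_alert_message(monitor_name, old_state, new_state, message, return_value):
    """Build the multi-line alert text shown in the sample message."""
    lines = [
        '[Revere Alarm]',
        'Monitor: %s' % monitor_name,
        'State Change: %s -> %s' % (old_state, new_state),
        'Message: %s' % message,
        'Return Value: %s' % return_value,
    ]
    return '\n'.join(lines)


sample = format_alert_message('Mail Queue Length', 'ALARM', 'OK', 'Monitor Passed', 67)
```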

Configuration

api_token - the API token for the user to send the message as
room_id - the id for the room to post to. Find this in the URL of the room
subdomain - the subdomain for the room to post to. Find this in the URL of the room

revere.alerts.sns.SNSAlert

Send a message to an Amazon Web Services Simple Notification Service (AWS SNS) topic. The message includes a subject and body for email subscribers, as well as a shortened message for SMS subscribers.

Configuration

region - the AWS region of the topic, as used in the example config.py. Example: us-east-1
topic_arn - the topic ARN to post to. Of the form: arn:aws:sns:us-east-1:1234567890:topic-name
access_key_id - the API Access Key ID to post to the topic
secret_key - the API Secret Key to post to the topic

Screenshots

A list of monitors with their current state and time since last run

The overview page for a monitor. It lists the past state changes including the return value from the monitor and alarm message.

Full history for a monitor

The list of alerts and which states they get triggered for.

Thanks

This project is mostly just cobbling together several other excellent projects.

Flask - the web front-end
SQLAlchemy - excellent lightweight database wrapper
APScheduler - managing the schedule for the monitors
Tornado - lightweight web server
Google Federated Logins for Flask - Google Apps Authentication
