-
Poller that invokes the BTC API and hydrates the DynamoDB table with the bitcoin prices
- The poller writes the bitcoin prices to DynamoDB with a conditional update
- It avoids a full table scan during the write.
- montecarlo/producer/lambda_function.py
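The scan-free conditional write can be sketched as a single UpdateItem call. This is a hypothetical helper, not the actual poller code; the attribute names ("id", "prices") are assumptions taken from the log output further down.

```python
def build_price_update(coin_id, price):
    """Build UpdateItem kwargs that append a price to the item's list.
    The write touches a single key -- no table scan is involved."""
    return {
        "Key": {"id": coin_id},
        # list_append + if_not_exists makes the update safe for the very
        # first write, when the 'prices' attribute does not yet exist.
        "UpdateExpression": "SET prices = list_append(if_not_exists(prices, :empty), :p)",
        "ExpressionAttributeValues": {":p": [price], ":empty": []},
        "ReturnValues": "ALL_NEW",
    }

# Usage (assuming a boto3 Table resource):
# table.update_item(**build_price_update("btcusd", "47548.4"))
```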
-
API Handler that serves requests
- montecarlo/lambda/api_request_handler.py
- Metrics:
- For gathering all metrics, the metric request handler does a full table scan
- This could be optimized by looking up the metric_id key
- For getting a specific coin's metrics, the request handler looks up a specific key rather than scanning the table. This is accomplished using a filter expression on the coin_id
- For ranking the metrics, we currently do a full table scan and fetch all data.
- We could store the daily average as a separate column to reduce the amount of data held in memory
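The contrast between the full scan and the suggested key lookup can be sketched as the kwargs a boto3 Table call would receive. The helper names are hypothetical and the "id" attribute name is an assumption taken from the logged writes:

```python
def scan_all_metrics_kwargs():
    """Full table Scan used for gathering all metrics: every item is read."""
    return {}  # no key condition -- the whole table is read and returned

def get_coin_kwargs(coin_id):
    """Key-based Query for one coin: only the requested item is read,
    which is the optimization suggested above for the metric_id lookup."""
    return {
        "KeyConditionExpression": "id = :cid",
        "ExpressionAttributeValues": {":cid": coin_id},
    }

# table.scan(**scan_all_metrics_kwargs())   # reads the whole table
# table.query(**get_coin_kwargs("btcusd"))  # reads a single key
```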
-
Metric Gatherer with business logic to obtain the information requested by the api.
- montecarlo/lambda/metric_gatherer.py
- get_metrics: uses a KeyConditionExpression to get metrics for a given key id
- get_all_metrics: full table scan
- get_day_metrics: filters get_metrics results for the given day, assuming 24-hour entries
- This should be changed to retrieve entries based on insertion timestamps, which should be recorded during the writes
- get_rank:
- Rank needs a full table scan as well:
- Get all metrics using get_all_metrics
- Compute the average and standard deviation
- Sort the values and return the rank
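The rank steps above can be sketched in plain Python. This is a sketch only: the real gatherer reads the price lists from DynamoDB, and the exact ranking criterion used here (z-score of the latest price against the coin's own mean) is an assumption about what "rank" means:

```python
import statistics

def get_rank(all_metrics, coin_id):
    """Rank coins by how far the latest price deviates from the coin's
    mean, measured in standard deviations. Rank 1 = largest deviation."""
    scores = {}
    for coin, prices in all_metrics.items():
        values = [float(p) for p in prices]
        mean = statistics.mean(values)
        stdev = statistics.stdev(values) if len(values) > 1 else 0.0
        scores[coin] = abs(values[-1] - mean) / stdev if stdev else 0.0
    # Sort by score, largest deviation first, and return the coin's rank
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked.index(coin_id) + 1
```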
-
Alert generation:
- Alert generation relies on the alert string being written into the logs.
- Email can then be decoupled from the poller this way.
- We can use metric filters to search for and match terms, phrases, or values in the log events. When a metric filter finds a match in the log events, it can increment the value of a CloudWatch metric.
- This metric filter can then be used to trigger a lambda that generates emails.
- This is a fair bit of work and possibly out of scope of this POC.
-
Spins up a CDK stack with the data in us-east-2
- Flushing data older than a day from the cache to DynamoDB for longer-term metric calculation would work better.
- Concurrent lambda executions can get expensive. Some of the cost-control knobs we have:
- Move the Metric Gatherer to an EC2/ECS instance and use a multi-threaded C++/Golang/Java application to split up the network requests to the BTC API and write them to the cache.
- A storage optimization can be obtained by keeping a sliding window of daily metrics: removing from the window and adding to the window every second, while maintaining a running average incrementally.
Eg:
- [MetricAge1, MetricAge2, ..., MetricAge86400] (86400 = 24 * 60 * 60 seconds in a day)
- At time 86400 + 1:
- Dequeue MetricAge1 from the window.
- Enqueue MetricAge86401 into the window.
- This optimization is necessary in production since DynamoDB has a 400 KB item size limit.
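The dequeue/enqueue steps above can be sketched with a bounded deque and a running sum, so the daily average is updated in O(1) per sample instead of re-reading the whole window. The class name and window size are illustrative:

```python
from collections import deque

WINDOW = 24 * 60 * 60  # one entry per second for a day

class SlidingAverage:
    """Fixed-size window with a running average: each new sample evicts
    the oldest one and adjusts the running sum incrementally."""
    def __init__(self, size=WINDOW):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def push(self, price):
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]  # dequeue the oldest sample
        self.window.append(price)          # enqueue the newest sample
        self.total += price
        return self.total / len(self.window)
```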
- Rank computation
- Replacing DynamoDB with ElastiCache
- Caching and updating the daily average values in Redis
- Using the cache to compute the rank instead of looking up values in DynamoDB, to avoid costly reads
- Writing the daily data to and retrieving it from Redis (ElastiCache) instead of looking it up in DynamoDB could enable faster lookups
- Retrieval times would be faster for daily metrics
Perhaps the biggest flaws with this implementation right now are:
- There are no alerts generated by this implementation. The most scalable way to generate alerts would be to:
- Generate alerts based off a metric filter on the Cloudwatch logs
- This metric filter would kick off a processing lambda that can send off custom alerts
- If the poller lambda fails, the metric aggregation will be buggy.
- To obviate this, we need a timestamp in the metrics pushed to DynamoDB from the poller, and can use that timestamp to determine the daily window.
- Coins are hardcoded, since it was cumbersome to get all the coin-pair combinations from a given index and test them.
- Metrics of importance:
- API Gateway monitors on API 5xxes
- Total API Invocation count
- API Availability dashboards as a percentage of 1 - 5xxes/Total RequestCount
- Lambda monitors:
- Lambda invocation errors
- Lambda invocation counts
- Default lambda execution metrics
- Alarms on Lambda DLQ
- Dynamodb monitors:
- Capacity monitors
- RCU/WCU monitors
- As the code stands, unit tests can be written for each component fairly easily using PyTest/unittest.
- Integration tests can be written using a long-running canary that polls the APIs periodically and checks that they return results.
- Load testing can be simulated by setting up a test API endpoint and configuring the poller to interact with the fake API. This approach will be more useful when there are more than 3 coins as I've hardcoded them in the code.
- Manual testing detailed in API endpoints section
- The app will query data from a publicly available source at least every 1 minute (try https://docs.cryptowat.ch/rest-api/ to get cryptocurrency quotes).
- The app has a REST API to enable the following user experience (you do not need to implement the user interface): obtain metrics for a given id, and its rank.
- The app will log an alert whenever a metric exceeds 3x the value of its average in the last 1 hour.
https://b7ccauan6k.execute-api.us-east-2.amazonaws.com/prod
curl https://b7ccauan6k.execute-api.us-east-2.amazonaws.com/prod
dict_keys(['ltcusd', 'dogeusd', 'btcusd'])
Gets available coins
curl {API_ENDPOINT}/prod/metrics/dogeusd
curl {API_ENDPOINT}/prod/rank/dogeusd
curl {API_ENDPOINT}/prod/metrics/btcusd
curl {API_ENDPOINT}/prod/rank/btcusd
curl {API_ENDPOINT}/prod/metrics/ltcusd
curl {API_ENDPOINT}/prod/rank/ltcusd
- Emits an alert of format: ("Alert " + coin_price + " " + coin_name) when price exceeds 3 * mean
- Future steps:
- Set up an SES email integration based off this log message: a custom metric filter -> SES, so the alert is handled asynchronously.
- The values in the db could be prefixed with the timestamp in case the hydrator for the values (poller component) crashes. Right now, we rely on those values being present and the poller never crashing.
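The stated 3x rule can be sketched as a small check that produces the alert string in the format above. This is a sketch of the requirement, not necessarily what the deployed poller does (the logs below show an alert firing without a 3x deviation), and the function name is hypothetical:

```python
import statistics

ALERT_MULTIPLIER = 3  # alert when the latest price exceeds 3x the average

def check_alert(coin_name, prices):
    """Return the alert string the poller would log, or None.
    Mirrors the format above: "Alert <coin_price> <coin_name>"."""
    values = [float(p) for p in prices]
    mean = statistics.mean(values)
    if values[-1] > ALERT_MULTIPLIER * mean:
        return "Alert " + prices[-1] + " " + coin_name
    return None
```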
2021-02-08T22:38:35.045-08:00 START RequestId: a593f728-86cc-49f2-b398-b3ed7086332a Version: $LATEST
2021-02-08T22:38:35.493-08:00 Writing to ddb {'Attributes': {'id': 'btcusd', 'prices': ['47090', '47237.3', '47251.9', '47120', '47079', '47184', '47300', '47500', '47564.1', '47548.4']}, 'ResponseMetadata': {'RequestId': 'A0SIUKH6ULTCQAODAMCVVI125BVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 09 Feb 2021 06:38:35 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '201', 'connection': 'keep-alive', 'x-amzn-requestid': 'A0SIUKH6ULTCQAODAMCVVI125BVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '3805190591'}, 'RetryAttempts': 0}}
2021-02-08T22:38:35.493-08:00 generating alerts
2021-02-08T22:38:35.766-08:00 GET METRICS ['47090', '47237.3', '47251.9', '47120', '47079', '47184', '47300', '47500', '47564.1', '47548.4']
2021-02-08T22:38:35.766-08:00 day_metrics ['47090', '47237.3', '47251.9', '47120', '47079', '47184', '47300', '47500', '47564.1', '47548.4']
2021-02-08T22:38:35.766-08:00 ['47090', '47237.3', '47251.9', '47120', '47079', '47184', '47300', '47500', '47564.1', '47548.4']
2021-02-08T22:38:35.766-08:00 Alert 47548.4 btcusd
2021-02-08T22:38:36.013-08:00 Writing to ddb {'Attributes': {'id': 'ltcusd', 'prices': ['170.84', '170.84', '170.73', '170.43', '170.02', '170.43', '170.91', '171.4', '171.54', '171.2']}, 'ResponseMetadata': {'RequestId': 'DI700BOGV8MRBF3J15T7KRIDENVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 09 Feb 2021 06:38:35 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '201', 'connection': 'keep-alive', 'x-amzn-requestid': 'DI700BOGV8MRBF3J15T7KRIDENVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '603431769'}, 'RetryAttempts': 0}}
2021-02-08T22:38:36.013-08:00 generating alerts
2021-02-08T22:38:36.273-08:00 GET METRICS ['170.84', '170.84', '170.73', '170.43', '170.02', '170.43', '170.91', '171.4', '171.54', '171.2']
2021-02-08T22:38:36.273-08:00 day_metrics ['170.84', '170.84', '170.73', '170.43', '170.02', '170.43', '170.91', '171.4', '171.54', '171.2']
2021-02-08T22:38:36.273-08:00 ['170.84', '170.84', '170.73', '170.43', '170.02', '170.43', '170.91', '171.4', '171.54', '171.2']
2021-02-08T22:38:36.292-08:00 Alert 171.2 ltcusd
2021-02-08T22:38:36.513-08:00 Writing to ddb {'Attributes': {'id': 'dogeusd', 'prices': ['0.0806613', '0.0806', '0.0804033', '0.0800001', '0.0799713', '0.0798435', '0.079769', '0.0796531', '0.0800735', '0.0798693']}, 'ResponseMetadata': {'RequestId': 'TNFC2HT0089SMN7RC8UFNT39EFVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Tue, 09 Feb 2021 06:38:36 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '230', 'connection': 'keep-alive', 'x-amzn-requestid': 'TNFC2HT0089SMN7RC8UFNT39EFVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '881094376'}, 'RetryAttempts': 0}}
2021-02-08T22:38:36.513-08:00 generating alerts
2021-02-08T22:38:36.768-08:00 GET METRICS ['0.0806613', '0.0806', '0.0804033', '0.0800001', '0.0799713', '0.0798435', '0.079769', '0.0796531', '0.0800735', '0.0798693']
2021-02-08T22:38:36.768-08:00 day_metrics ['0.0806613', '0.0806', '0.0804033', '0.0800001', '0.0799713', '0.0798435', '0.079769', '0.0796531', '0.0800735', '0.0798693']
2021-02-08T22:38:36.768-08:00 ['0.0806613', '0.0806', '0.0804033', '0.0800001', '0.0799713', '0.0798435', '0.079769', '0.0796531', '0.0800735', '0.0798693']
2021-02-08T22:38:36.772-08:00 No alerts0.0798693 dogeusd
2021-02-08T22:38:36.774-08:00 END RequestId: a593f728-86cc-49f2-b398-b3ed7086332a
2021-02-08T22:38:36.774-08:00 REPORT RequestId: a593f728-86cc-49f2-b398-b3ed7086332a Duration: 1729.22 ms Billed Duration: 1730 ms Memory Size: 128 MB Max Memory Used: 77 MB Init Duration: 330.53 ms
To manually create a virtualenv on MacOS and Linux:
$ python3 -m venv .venv
After the init process completes and the virtualenv is created, you can use the following step to activate your virtualenv.
$ source .venv/bin/activate
If you are on a Windows platform, you would activate the virtualenv like this:
% .venv\Scripts\activate.bat
Once the virtualenv is activated, you can install the required dependencies.
$ pip install -r requirements.txt
At this point you can now synthesize the CloudFormation template for this code.
$ cdk synth
To add additional dependencies, for example other CDK libraries, just add them to your setup.py file and rerun the pip install -r requirements.txt command.
Source your AWS credentials, then:
$ cdk diff; cdk deploy