peel / s3-events

Repository on GitHub: https://github.com/peel/s3-events

S3 to Snowplow

Use case

Send a notification whenever incoming data is larger than the estimated processor throughput.

Concept

The Snowplow ops1 pipeline can consume events for:

  • S3 incoming data (via Lambda S3 notifications)
  • data processor size (via the mt-configs2 repository or via a bands.yml file; a GitHub webhook is discouraged because data may vanish between commits)

The data is then put into Elasticsearch (ES), where it is cross-referenced. The idea is to leverage the ops1 pipeline by pushing S3 notifications (object key + byte size) as Snowplow events (Josh suggested a Go lambda) so they end up in ES.
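The core of the concept, wrapping an S3 object's key and byte size in a self-describing Snowplow JSON, can be sketched in Go (the repo's language). Everything below is illustrative, not the repo's actual code: the `s3Record` struct is a trimmed-down mirror of the S3 notification payload, and the schema URI stands in for the real `EVENT_SCHEMA` value (which is `TBD` in the deploy step).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down shape of one record in an S3 event notification;
// field names follow the S3 event JSON, keeping only what we need.
type s3Record struct {
	S3 struct {
		Bucket struct {
			Name string `json:"name"`
		} `json:"bucket"`
		Object struct {
			Key  string `json:"key"`
			Size int64  `json:"size"`
		} `json:"object"`
	} `json:"s3"`
}

// buildEvent wraps the object key and byte size in a self-describing
// JSON envelope keyed by a schema URI (here a hypothetical placeholder
// standing in for the EVENT_SCHEMA environment variable).
func buildEvent(schema, bucket, key string, size int64) (string, error) {
	payload := map[string]interface{}{
		"schema": schema,
		"data": map[string]interface{}{
			"bucket": bucket,
			"key":    key,
			"size":   size,
		},
	}
	b, err := json.Marshal(payload)
	return string(b), err
}

func main() {
	ev, _ := buildEvent("iglu:com.example/s3_object/jsonschema/1-0-0",
		"my-bucket", "any.txt", 42)
	fmt.Println(ev)
	// The real lambda would POST a payload like this to COLLECTOR_URI
	// (e.g. micro's collector endpoint) so it lands in the pipeline.
}
```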

Running

Required components:

  • Snowplow (e.g. Snowplow Micro)
  • AWS S3 and Lambda (e.g. Localstack with SERVICES=serverless,lambda,s3)
1. Fire up with Docker:
TMPDIR=/private$TMPDIR docker-compose up
2. Deploy the lambda and create the S3 bucket:
aws --endpoint-url http://localhost:4574 lambda create-function \
  --function-name s3-events --runtime go1.x \
  --zip-file fileb://function.zip --handler main \
  --role arn:aws:iam::123456:role/irrelevant \
  --environment "Variables={EVENT_SCHEMA=TBD,COLLECTOR_URI=TBD}"

aws --endpoint-url=http://localhost:4572 s3 mb s3://my-bucket
3. Set up notifications:
aws --endpoint-url=http://localhost:4572 s3api put-bucket-notification-configuration --bucket my-bucket --notification-configuration file://config/notification.json
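The contents of config/notification.json are not shown here; for reference, a bucket-notification configuration that routes object-creation events to the function might look like the following (the Lambda ARN is a Localstack-style placeholder, not taken from the repo):

```json
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:000000000000:function:s3-events",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
```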
4. Put anything to S3:
touch any.txt
aws --endpoint-url=http://localhost:4572 s3 cp any.txt s3://my-bucket/any.txt
5. Get an event in micro.

Languages

Nix 71.9%, Go 28.1%