weaveworks / scope

Monitoring, visualisation & management for Docker & Kubernetes

Home Page: https://www.weave.works/oss/scope/

Use S3 for historical queries instead of DynamoDB

bboreham opened this issue

Scope has an optional multitenant mode, where reports are saved to S3 and indexed in DynamoDB.

Once #3783 is done, live rendering will not use the store, so we will have far less time pressure. I think we can drop the index and just use an S3 'list' API call to find objects.

However, we will need to change the object path name to include the time as a prefix.
Current paths look like s3://bucket-name/00002140a76ed46df4956c4af4004160/1554123600273225527, where the first part is an MD5 hash of the tenant ID and hour number, and the second part is the Unix timestamp in nanoseconds.
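As a rough illustration of a time-prefixed layout, here is a minimal Go sketch. The function names `reportKey` and `hourPrefix`, the tenant ID `tenant-a`, and the exact `tenant/date/hour/timestamp` format are assumptions made for this example, not the layout Scope actually adopted.

```go
package main

import (
	"fmt"
	"time"
)

// hourPrefix builds a hypothetical key prefix of the form tenant/date/hour/
// for the hour bucket (in UTC) that contains unixNanos.
func hourPrefix(tenant string, unixNanos int64) string {
	t := time.Unix(0, unixNanos).UTC()
	return fmt.Sprintf("%s/%s/", tenant, t.Format("2006-01-02/15"))
}

// reportKey appends the nanosecond timestamp to the hour prefix, so that all
// reports for one tenant and hour share a common, listable prefix.
func reportKey(tenant string, unixNanos int64) string {
	return fmt.Sprintf("%s%d", hourPrefix(tenant, unixNanos), unixNanos)
}

func main() {
	// Timestamp taken from the example path in this issue.
	const ts = int64(1554123600273225527)
	fmt.Println(reportKey("tenant-a", ts))
	// prints "tenant-a/2019-04-01/13/1554123600273225527"
}
```

With keys shaped like this, a single S3 list request with the hour prefix would return every report for that tenant and hour, in timestamp order for timestamps with the same number of digits.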

Steps to complete:

  1. change S3 object pathname so the prefix is tenant/date/hour (or maybe finer-grained).
  2. change querier to list reports within a prefix time-bucket using S3 rather than DynamoDB.
  3. add switch-over date so querier uses DynamoDB index before that and S3 list after.
  4. stop collectors writing to DynamoDB.
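Steps 2 and 3 could be sketched roughly as below: split a query's time range at the switch-over instant, serve the older part from the DynamoDB index, and enumerate hourly S3 prefixes to list for the rest. This is only a sketch under assumptions: `queryPlan`, `planQuery`, `hourPrefixes`, the `utc` helper, and the `tenant/date/hour/` prefix format are all hypothetical names, and the actual S3 list and DynamoDB calls are left out.

```go
package main

import (
	"fmt"
	"time"
)

// utc is a small helper for building UTC times in examples.
func utc(y, m, d, h, min int) time.Time {
	return time.Date(y, time.Month(m), d, h, min, 0, 0, time.UTC)
}

// hourPrefixes returns one hypothetical tenant/date/hour/ prefix for each
// hour bucket that overlaps the half-open range [from, to).
func hourPrefixes(tenant string, from, to time.Time) []string {
	var prefixes []string
	for t := from.UTC().Truncate(time.Hour); t.Before(to); t = t.Add(time.Hour) {
		prefixes = append(prefixes, fmt.Sprintf("%s/%s/", tenant, t.Format("2006-01-02/15")))
	}
	return prefixes
}

// queryPlan describes how a query range is split at the switch-over.
type queryPlan struct {
	UseIndex   bool      // true if part of the range predates the switch-over
	IndexUntil time.Time // end of the index-served part (valid if UseIndex)
	S3Prefixes []string  // prefixes to list in S3 for the remainder
}

// planQuery sends everything before switchOver to the legacy index and
// everything from switchOver onwards to S3 prefix listing.
func planQuery(tenant string, from, to, switchOver time.Time) queryPlan {
	var p queryPlan
	s3From := from
	if from.Before(switchOver) {
		p.UseIndex = true
		p.IndexUntil = to
		if switchOver.Before(to) {
			p.IndexUntil = switchOver
		}
		s3From = switchOver
	}
	if s3From.Before(to) {
		p.S3Prefixes = hourPrefixes(tenant, s3From, to)
	}
	return p
}

func main() {
	switchOver := utc(2019, 4, 1, 13, 0)
	p := planQuery("tenant-a", utc(2019, 4, 1, 12, 30), utc(2019, 4, 1, 14, 10), switchOver)
	fmt.Println(p.UseIndex, p.IndexUntil.Format(time.RFC3339), p.S3Prefixes)
}
```

A query that straddles the switch-over ends up using both stores, which is why the switch-over date would need to stay in the querier until the retention period for pre-switch reports has passed.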