Kinto / kinto-signer

Digital signatures to guarantee integrity and authenticity of collections of records.

kinto-signer does not invalidate cache as expected

bqbn opened this issue · comments

commented

Two records were approved in the staging/plugins collection this morning per [1], at 15:49 and 15:54 UTC respectively. From the logs, we can see that kinto-signer issued two cache invalidation calls to CloudFront:

Sep 12 15:49:13 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767353229344512, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 9, "Fields": {"msg": "Starting new HTTP connection (1): 169.254.169.254"}}
Sep 12 15:49:13 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767353232492288, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 9, "Fields": {"msg": "Starting new HTTP connection (1): 169.254.169.254"}}
Sep 12 15:49:13 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767353265446912, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 9, "Fields": {"msg": "Starting new HTTPS connection (1): cloudfront.amazonaws.com"}}
Sep 12 15:49:13 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767353805437440, "Type": "kinto_signer.updater", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 9, "Fields": {"msg": "Invalidated CloudFront cache at /v1/*"}}
... ...
Sep 12 15:54:46 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767686217423360, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 8, "Fields": {"msg": "Starting new HTTP connection (1): 169.254.169.254"}}
Sep 12 15:54:46 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767686220657408, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 8, "Fields": {"msg": "Starting new HTTP connection (1): 169.254.169.254"}}
Sep 12 15:54:46 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767686249065216, "Type": "botocore.vendored.requests.packages.urllib3.connectionpool", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 8, "Fields": {"msg": "Starting new HTTPS connection (1): cloudfront.amazonaws.com"}}
Sep 12 15:54:46 ip-172-31-2-43 docker-kinto[2622]: {"Timestamp": 1536767686713996288, "Type": "kinto_signer.updater", "Logger": "kinto", "Hostname": "ip-172-31-2-43.us-west-2.compute.internal", "EnvVersion": "2.0", "Severity": 6, "Pid": 8, "Fields": {"msg": "Invalidated CloudFront cache at /v1/*"}}

We checked the CloudFront logs and saw that both invalidation requests were received by CloudFront and completed successfully.
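For context, an invalidation like the one logged above ("Invalidated CloudFront cache at /v1/*") can be issued through the CloudFront API. The sketch below uses boto3; the distribution ID, client wiring, and helper names are assumptions for illustration, not kinto-signer's actual code.

```python
# Hedged sketch: issuing a CloudFront invalidation with boto3.
# The helper names and distribution ID handling are assumptions,
# not kinto-signer's implementation.
import time


def build_invalidation_batch(paths):
    """Build the InvalidationBatch payload expected by CloudFront."""
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request.
        "CallerReference": str(time.time()),
    }


def invalidate(client, distribution_id, paths):
    # client is a boto3 CloudFront client, e.g. boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


# Example payload for the path seen in the logs:
batch = build_invalidation_batch(["/v1/*"])
```

Note that `/v1/*` is a blanket path: if CloudFront completed the request, every object under `/v1/` should have been evicted, which is why the later staleness is surprising.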

However, a while later we got paged because the kinto-lambda-validate_signature lambda failed. The log shows that it failed on the plugins collection:

Signature verification failed on /buckets/blocklists/collections/plugins
- Signed on: 1536767686204 (2018-09-12 15:54:46 UTC)
- Records timestamp: 1524073129197 (2018-04-18 17:38:49 UTC): ValidationError
Traceback (most recent call last):
File "/var/task/aws_lambda.py", line 144, in validate_signature
raise ValidationError("\n" + "\n\n".join(error_messages))
aws_lambda.ValidationError:
Signature verification failed on /buckets/blocklists/collections/plugins
- Signed on: 1536767686204 (2018-09-12 15:54:46 UTC)
- Records timestamp: 1524073129197 (2018-04-18 17:38:49 UTC)

We had to manually run the kinto-lambda-refresh_signature lambda to clear all caches, after which the validate_signature lambda went back to normal.

It looks like whatever kinto-signer tried to invalidate did not take effect as expected (otherwise the validate_signature lambda would not have complained).
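The symptom in the error message above is a large gap between the two timestamps: the signature metadata is from September while the records timestamp served back is from April, which suggests a stale cache. A minimal sketch of that check (not the lambda's actual code, which verifies the signature cryptographically; the threshold here is an assumption for illustration):

```python
# Hedged sketch: flag the kind of staleness reported above, where the
# signature is months newer than the records timestamp served back.
# This is NOT the validate_signature lambda's real logic.
from datetime import datetime, timezone


def check_freshness(signed_ms, records_ms):
    """Return (signed_dt, records_dt, looks_stale) from epoch-millisecond stamps."""
    signed = datetime.fromtimestamp(signed_ms / 1000, tz=timezone.utc)
    records = datetime.fromtimestamp(records_ms / 1000, tz=timezone.utc)
    # A signature refresh normally follows a records change closely;
    # a gap of days points at a cache serving old records.
    looks_stale = (signed - records).days > 1
    return signed, records, looks_stale


# Values taken from the error message above:
signed, records, stale = check_freshness(1536767686204, 1524073129197)
```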

One difference we can see when running the kinto-lambda-refresh_signature lambda is that it sends 15 invalidation requests (possibly corresponding to the following collections):

Looking at /buckets/monitor/collections/changes: 
Looking at /buckets/main-workspace/collections/focus-experiments: Refresh signature: status= signed at 2018-09-12 16:42:25 UTC ( 1536770545952 )
Looking at /buckets/staging/collections/plugins: Refresh signature: status= signed at 2018-09-12 16:42:27 UTC ( 1536770547395 )
Looking at /buckets/main-workspace/collections/cfr: status= None at 2018-09-12 15:49:10 UTC ( 1536767350764 )
Looking at /buckets/staging/collections/addons: Refresh signature: status= signed at 2018-09-12 16:42:29 UTC ( 1536770549216 )
Looking at /buckets/main-workspace/collections/onboarding: status= None at 2018-09-12 15:49:10 UTC ( 1536767350796 )
Looking at /buckets/main-workspace/collections/rocket-prefs: Refresh signature: status= signed at 2018-09-12 16:42:31 UTC ( 1536770551397 )
Looking at /buckets/staging/collections/certificates: Refresh signature: status= signed at 2018-09-12 16:42:32 UTC ( 1536770552836 )
Looking at /buckets/main-workspace/collections/tippytop: Refresh signature: status= signed at 2018-09-12 16:42:34 UTC ( 1536770554729 )
Looking at /buckets/security-state-staging/collections/cert-revocations: status= None at 2018-09-12 15:49:12 UTC ( 1536767352455 )
Looking at /buckets/security-state-staging/collections/intermediates: status= None at 2018-09-12 15:49:12 UTC ( 1536767352420 )
Looking at /buckets/staging/collections/qa: Refresh signature: status= signed at 2018-09-12 16:42:37 UTC ( 1536770557033 )
Looking at /buckets/pinning-staging/collections/pins: Refresh signature: status= signed at 2018-09-12 16:42:38 UTC ( 1536770558521 )
Looking at /buckets/staging/collections/gfx: Refresh signature: status= signed at 2018-09-12 16:42:40 UTC ( 1536770560323 )

However, the logs did not tell us which collections the two invalidation requests sent by kinto-signer were for.
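One hypothetical way to make the logs actionable here: if each invalidation targeted a per-collection path instead of the blanket `/v1/*` seen in the logs, the log line itself would identify the collection. This is only a suggestion sketch; the path scheme below assumes the standard Kinto URL layout and is not kinto-signer's current behavior.

```python
# Hypothetical helper: build a CloudFront invalidation path scoped to one
# collection, so the log line identifies which collection was invalidated.
def collection_invalidation_path(bucket_id, collection_id):
    return f"/v1/buckets/{bucket_id}/collections/{collection_id}*"


path = collection_invalidation_path("staging", "plugins")
# would be used as Items=[path] in a CloudFront InvalidationBatch, and the
# same path string could be included in the "Invalidated CloudFront cache
# at ..." log message.
```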

This issue may also be related to mozilla-services/remote-settings-lambdas#46, where we didn't know what made the validate_signature lambda fail periodically. The cause may be the same: when someone signs something, the signer does not clear the cache correctly.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1490097

commented

One more piece of info: someone just approved a few records in main-workspace/onboarding, and it did not cause any trouble for the validate_signature lambda.

I'm not sure whether the investigation should focus on the staging and blocklists buckets; they are our old buckets, whereas main-workspace is new.

We no longer invalidate the cache since #1256.