CAVaccineInventory / vial

The Django application powering calltheshots.us

Home Page: https://vial.calltheshots.us

/api/exportVaccinateTheStates failing in production

simonw opened this issue

Refs #705. The endpoint is showing "failed" messages when called using Cloud Scheduler.

[Screenshot: Jobs, Cloud Scheduler, vial, Google Cloud Platform]

The staging job - running the same export code, but against a smaller set of data - works OK.

Nothing in Sentry. It looks like it's been failing for quite a while - definitely since I deployed the latest code.

[Screenshot: Logs Explorer, Logging, vial, Google Cloud Platform]

I tried running it manually like so:

~ % curl -XPOST 'https://vial.calltheshots.us/api/exportVaccinateTheStates' --header "Authorization: Bearer 27:b5d06ef7..." -d ''
{"ok": 1}

That worked! https://api.vaccinatethestates.com/ shows a whole bunch of files with a last modification date around 2021-07-08T23:23:53.299Z - which is 9 minutes ago in UTC.

Looking at the logs from the most recent two attempts:

[Screenshot: Logs Explorer, Logging, vial, Google Cloud Platform]

It's suspicious that exactly 3 minutes elapsed between that initial log line (presumably corresponding to the start of the request) and the log line with the error. That suggests to me that there's a Cloud Scheduler timeout of some sort here.
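
One way to check the timeout theory would be to time the manual request and compare it against a candidate deadline. This is only a sketch: `time_export` and the `VIAL_API_KEY` environment variable are hypothetical names, and the deadline to compare against is passed in by the caller.

```shell
# Sketch: time a POST to the export endpoint and compare it with a deadline in seconds.
# time_export and VIAL_API_KEY are hypothetical names, not part of VIAL or gcloud.
time_export() {
  url="$1"
  deadline="$2"
  # -w '%{time_total}' makes curl print the total request time in seconds when it finishes.
  elapsed="$(curl -sS -o /dev/null -w '%{time_total}' -XPOST "$url" \
    --header "Authorization: Bearer $VIAL_API_KEY" -d '')"
  # Compare as decimals with awk, since shell arithmetic is integer-only.
  if awk -v e="$elapsed" -v d="$deadline" 'BEGIN { exit !(e > d) }'; then
    echo "took ${elapsed}s, over the ${deadline}s deadline"
  else
    echo "took ${elapsed}s, within the ${deadline}s deadline"
  fi
}
```

It could be called as `time_export 'https://vial.calltheshots.us/api/exportVaccinateTheStates' 180`, with the key supplied through the environment rather than pasted on the command line.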

https://stackoverflow.com/a/66298820/6083 suggests you can set a job "deadline" from the CLI like this:

gcloud beta scheduler jobs update http <job> --attempt-deadline=1800s --project <project>

Looks like that option isn't available in the web console interface at https://console.cloud.google.com/cloudscheduler/jobs/edit/us-west2/vaccinatethestates-api-export-production?project=django-vaccinateca

I have CLI access on my laptop:

~ % gcloud beta scheduler jobs list --project django-vaccinateca
ID                                        LOCATION  SCHEDULE (TZ)                                                    TARGET_TYPE  STATE
api-export-production                     us-west2  every 1 minutes (America/Los_Angeles)                            HTTP         ENABLED
api-export-staging                        us-west2  every 1 minutes (America/Los_Angeles)                            HTTP         ENABLED
mapbox-export                             us-west2  0 2,9,10,11,12,13,14,15,16,17,18,21 * * * (America/Los_Angeles)  HTTP         ENABLED
resolve-missing-counties-production       us-west2  */10 * * * * (America/Los_Angeles)                               HTTP         ENABLED
resolve-missing-counties-staging          us-west2  */10 * * * * (America/Los_Angeles)                               HTTP         ENABLED
vaccinatethestates-api-export-production  us-west2  */10 * * * * (America/Los_Angeles)                               HTTP         ENABLED
vaccinatethestates-api-export-staging     us-west2  */10 * * * * (America/Los_Angeles)                               HTTP         ENABLED

Confirmed: the current deadline is 180s:

~ % gcloud beta scheduler jobs describe vaccinatethestates-api-export-production --project django-vaccinateca
attemptDeadline: 180s
description: Hit /api/exportVaccinateTheStates to export to api.vaccinatethestates.com
  bucket
httpTarget:
  headers:
    Authorization: Bearer 27:b5d06ef7bfa1650267ed9750228c5e93
    User-Agent: Google-Cloud-Scheduler
  httpMethod: POST
  uri: https://vial.calltheshots.us/api/exportVaccinateTheStates
lastAttemptTime: '2021-07-08T23:40:00.992830Z'
name: projects/django-vaccinateca/locations/us-west2/jobs/vaccinatethestates-api-export-production
retryConfig:
  maxBackoffDuration: 3600s
  maxDoublings: 5
  maxRetryDuration: 0s
  minBackoffDuration: 5s
schedule: '*/10 * * * *'
scheduleTime: '2021-07-08T23:50:00.061563Z'
state: ENABLED
status:
  code: 2
timeZone: America/Los_Angeles
userUpdateTime: '2021-06-10T23:34:48Z'

I'm going to set the deadline to 9 minutes (since the cron runs every 10 minutes) - which is 540s.

gcloud beta scheduler jobs update http vaccinatethestates-api-export-production --attempt-deadline=540s --project django-vaccinateca
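
For reference, the 540s figure is just the cron interval minus one minute, converted to seconds. A small sketch of that arithmetic, where the one-minute headroom is my own assumption about how much slack to leave before the next run:

```shell
# Assumption: leave one minute of headroom before the next scheduled run.
interval_minutes=10   # from the '*/10 * * * *' schedule
headroom_minutes=1    # hypothetical safety margin
deadline_seconds=$(( (interval_minutes - headroom_minutes) * 60 ))
echo "--attempt-deadline=${deadline_seconds}s"   # prints --attempt-deadline=540s
```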

Oops, I just accidentally shared an API key in the comment above ^ - I'll revoke it and issue a new one now.

API key deleted:
[Screenshot: 27:b5d06ef7..., Change api key, VIAL admin]

That fixed it - the scheduled task ran successfully.

I'm going to bump up the time limit on the staging job too:

gcloud beta scheduler jobs update http vaccinatethestates-api-export-staging --attempt-deadline=540s --project django-vaccinateca
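
To confirm the new deadline took effect on both jobs, the `describe` output can be narrowed to the one field with gcloud's `--format` flag. A minimal sketch, assuming the field is exposed as `attemptDeadline` as in the YAML above; `check_deadline` is a hypothetical helper name, and newer gcloud releases may also require a `--location` argument:

```shell
# Sketch: print a job's attempt deadline and fail if it differs from the expected value.
# check_deadline is a hypothetical helper; the default project is the one used in this thread.
check_deadline() {
  job="$1"
  expected="$2"
  project="${3:-django-vaccinateca}"
  # --format='value(...)' prints only the named field instead of the full YAML.
  actual="$(gcloud beta scheduler jobs describe "$job" \
    --project "$project" --format='value(attemptDeadline)')"
  if [ "$actual" = "$expected" ]; then
    echo "$job: attemptDeadline is $actual"
  else
    echo "$job: attemptDeadline is $actual, expected $expected" >&2
    return 1
  fi
}
```

Usage would be `check_deadline vaccinatethestates-api-export-production 540s`, and the same again for the staging job.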