HTTPArchive / bigquery

BigQuery import and processing pipelines

Monthly batch jobs hosted under older user directory

tunetheweb opened this issue

The monthly batch jobs are still hosted in /home/igrigorik/code, and the cron runs under that user (though it connects to BigQuery as a different user).

Additionally, this is currently not authenticated to GCP.

@rviscomi, it looks like it was last using your account, but I obviously can't re-authenticate that:

igrigorik@worker:~/code$ bq show "httparchive:pages.2021_12_01_desktop"
ERROR: (bq) Your current active account [rviscomi@google.com] does not have any valid credentials

We need to fix the BigQuery authentication issue before the January run finishes next month; otherwise the cron will neither process the pipeline nor run the reports, as both are under this cron.
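One way to catch this kind of failure before the monthly run is a small pre-flight check at the top of the cron script. The following is only a sketch, assuming the `bq` CLI is on the cron user's PATH; the function name `check_bq_auth` is hypothetical and not part of the existing pipeline.

```shell
#!/usr/bin/env bash
# Pre-flight check sketch (hypothetical helper, not part of the existing scripts).
# Returns 0 if `bq show` succeeds for the given table, 1 otherwise, so the
# cron job can stop early and report a credentials problem.
check_bq_auth() {
  table="$1"
  if bq show "$table" >/dev/null 2>&1; then
    echo "OK: bq can read $table"
    return 0
  fi
  echo "ERROR: bq cannot read $table; re-authenticate before the next run" >&2
  return 1
}
```

A cron script could then start with something like `check_bq_auth "httparchive:pages.2021_12_01_desktop" || exit 1`, using any table known to exist.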

Longer term, we should probably also move these out of the /home/igrigorik/code directory and run the cron under a generic account on the server (create an httparchive user?), ideally with an equivalent BigQuery account it can use that won't expire.
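For the non-expiring account, one option would be a GCP service account activated from a key file under the generic user. This is a rough sketch under that assumption, not what was actually set up: the helper name `activate_service_account` and the key path in the usage note are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch: activate a service account for the generic cron user.
# Assumes a service account key file has already been created and copied
# to the server; the path passed in is a placeholder chosen by the caller.
activate_service_account() {
  key_file="$1"
  if [ ! -r "$key_file" ]; then
    echo "ERROR: key file not readable: $key_file" >&2
    return 1
  fi
  # Real gcloud command; makes the service account the active credential.
  gcloud auth activate-service-account --key-file="$key_file"
}
```

It might be invoked once as the new user, for example `activate_service_account /home/httparchive/keys/bq-key.json` (hypothetical path), after which `bq` commands run by that user's cron would use the service account.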

Done (@tunetheweb did it)