mozilla / bigquery-etl

Bigquery ETL

Home Page: https://mozilla.github.io/bigquery-etl

Remove usage of `referenced_tables` in metadata.yaml

scholtzan opened this issue

We used to rely on a dry run to find every table a query references. This caused problems for queries referencing tables that the dry run didn't have permission to access. So we added `referenced_tables` to metadata.yaml, which lists the referenced tables explicitly instead of running a dry run; it also sped up the process for queries referencing main_v4.
We have since switched to sqlglot to determine table dependencies, so it should be safe to remove `referenced_tables` and ignore it when generating Airflow DAGs.
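To make the proposal concrete, here is a rough sketch of what "ignore `referenced_tables`" could look like. It is not the repo's actual code. The metadata layout (a `scheduling` block holding `referenced_tables`), the helper names, and the table names are all assumptions, and the regex is only a naive stand-in for the sqlglot-based parsing so the snippet stays stdlib-only.

```python
import copy
import re

# Naive stand-in for a real SQL parser such as sqlglot: it only catches
# simple FROM/JOIN references and is here purely for illustration.
_TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+`?([\w-]+(?:\.[\w-]+)*)`?", re.IGNORECASE)


def upstream_tables(sql: str) -> list[str]:
    """Derive upstream tables from the query text itself (hypothetical helper)."""
    return sorted(set(_TABLE_RE.findall(sql)))


def drop_referenced_tables(metadata: dict) -> dict:
    """Return a copy of the metadata without the legacy override.

    Assumes the key lives under a `scheduling` block; the real schema may differ.
    """
    cleaned = copy.deepcopy(metadata)
    cleaned.get("scheduling", {}).pop("referenced_tables", None)
    return cleaned


# Illustrative inputs; project, dataset and DAG names are made up.
sql = (
    "SELECT * FROM `my-project.my_dataset.main_v4` m "
    "JOIN my_dataset.clients c USING (client_id)"
)
metadata = {
    "scheduling": {
        "dag_name": "bqetl_example",
        "referenced_tables": [["my-project", "my_dataset", "main_v4"]],
    }
}

print(upstream_tables(sql))
print(drop_referenced_tables(metadata))
```

The point of the sketch is that dependencies come from the query text alone, so the manual list in the metadata can be dropped without changing which upstream tables the DAG generator sees.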

Should we also remove `referencedTables` from the dry run cloud function and related utils?

There are still a few edge cases where `referenced_tables` is used, afaik.