dbt-content / google-datacatalog-dbt-tag

Update a Google Data Catalog tag with dbt Cloud run metadata

Home Page: https://www.getdbt.com

Google Cloud Data Catalog and dbt

An example that creates or updates a Google Cloud Data Catalog tag on BigQuery tables or views with dbt Cloud metadata, using a Python Cloud Function.

Data Catalog tag: the dbt Run Metadata tag is attached to a BigQuery table or view and contains information from the dbt run that created or updated it: run durations and date, dbt project and model, Cloud job, Cloud project, and approximate size and row count.

To enable, learn about, and use Cloud Data Catalog, go to https://cloud.google.com/data-catalog and https://console.cloud.google.com/datacatalog.

This repository contains the Cloud Function Python code to create or update the Data Catalog tag.

This Cloud Function uses:

In your Cloud Function, you need the 5 files:

Before running the Cloud Function (and creating or updating tags), you need to create the Data Catalog Tag Template for dbt Run Metadata.

You can use:
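For instance, a minimal Python sketch with the `google-cloud-datacatalog` client library could look like the following. The template ID `dbt_run_metadata`, the field IDs, and the field types are assumptions made for illustration, loosely derived from the tag description above; the template expected by this repository's Cloud Function may differ.

```python
# Sketch only: TEMPLATE_ID and every field ID/type below are hypothetical,
# not taken from this repository.
TEMPLATE_ID = "dbt_run_metadata"

# field_id -> (display name, Data Catalog primitive type name)
TEMPLATE_FIELDS = {
    "run_date": ("dbt run date", "TIMESTAMP"),
    "run_duration": ("dbt run duration", "STRING"),
    "dbt_project": ("dbt project", "STRING"),
    "dbt_model": ("dbt model", "STRING"),
    "cloud_job": ("dbt Cloud job", "STRING"),
    "cloud_project": ("dbt Cloud project", "STRING"),
    "approx_size": ("Approximate size", "STRING"),
    "approx_rows": ("Approximate row count", "DOUBLE"),
}


def create_dbt_tag_template(project_id, location):
    """Create the tag template in the given project and location."""
    # Imported lazily so the field definitions can be inspected without
    # the client library (pip install google-cloud-datacatalog).
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    template = datacatalog_v1.types.TagTemplate()
    template.display_name = "dbt Run Metadata"

    for field_id, (display_name, type_name) in TEMPLATE_FIELDS.items():
        field = datacatalog_v1.types.TagTemplateField()
        field.display_name = display_name
        # Map the type name to the PrimitiveType enum member.
        field.type_.primitive_type = getattr(
            datacatalog_v1.types.FieldType.PrimitiveType, type_name
        )
        template.fields[field_id] = field

    return client.create_tag_template(
        parent=f"projects/{project_id}/locations/{location}",
        tag_template_id=TEMPLATE_ID,
        tag_template=template,
    )
```

Calling `create_dbt_tag_template("my-project", "us-central1")` (both values are placeholders) requires credentials that are allowed to create tag templates in that project.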

To use the Cloud Function, pass the dbt Cloud Run ID and the dbt Cloud Account ID as JSON, for example {"dbt_run_id":"13161733","dbt_account_id":"11442"}.
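As a rough sketch of such a call, the snippet below builds the JSON payload and posts it to the function. It assumes the Cloud Function is deployed with an HTTP trigger and accepts unauthenticated requests; the URL passed to `invoke` and the helper names are hypothetical.

```python
import json
import urllib.request


def build_payload(dbt_run_id, dbt_account_id):
    """Return the JSON body, with both IDs sent as strings as in the example."""
    return json.dumps(
        {"dbt_run_id": str(dbt_run_id), "dbt_account_id": str(dbt_account_id)}
    )


def build_request(function_url, dbt_run_id, dbt_account_id):
    """Prepare (but do not send) a POST request carrying the payload."""
    body = build_payload(dbt_run_id, dbt_account_id).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(function_url, dbt_run_id, dbt_account_id, timeout=60):
    """Send the request and return (HTTP status, response text)."""
    request = build_request(function_url, dbt_run_id, dbt_account_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

A call would then look like `invoke("https://REGION-PROJECT.cloudfunctions.net/FUNCTION_NAME", "13161733", "11442")`, with the URL replaced by the one of your own deployment. If the function requires authentication, an identity token would also have to be sent in an `Authorization` header.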

Once the Data Catalog tag template is created and a tag has been created or updated on BigQuery tables or views, you can find all the results at https://console.cloud.google.com/datacatalog.

Finally, you can also search Cloud Data Catalog for BigQuery tables or views that carry a dbt tag from your own application; see https://github.com/dbt-content/dbt-datacatalog-explorer for an example.
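A search of this kind might be sketched with the `google-cloud-datacatalog` client as shown below. The template ID `dbt_run_metadata` is an assumption (use the ID of your own tag template), and the helper names are hypothetical. The query relies on the Data Catalog `tag:` search qualifier, which matches against the tag template and field path.

```python
def build_tag_query(template_id="dbt_run_metadata", field_id=None):
    """Build a Data Catalog search query for entries tagged with a template.

    template_id is a hypothetical default; field_id optionally narrows the
    match to one field of the template.
    """
    key = template_id if field_id is None else f"{template_id}.{field_id}"
    return f"tag:{key}"


def search_tagged_entries(project_id, query):
    """Return the linked resource names of entries matching the query."""
    # Imported lazily (pip install google-cloud-datacatalog).
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    # Restrict the search to a single project.
    scope = datacatalog_v1.types.SearchCatalogRequest.Scope()
    scope.include_project_ids.append(project_id)

    results = client.search_catalog(scope=scope, query=query)
    return [result.linked_resource for result in results]
```

For example, `search_tagged_entries("my-project", build_tag_query())` would list the BigQuery resources in the placeholder project `my-project` that carry a tag from the assumed template.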


Happy tagging!


Languages

Python 87.2%, Shell 12.8%