This example demonstrates how to use Cloud Composer DAGs to:
- Restore a Postgres backup
- Extract data from Postgres to Cloud Storage (Data Lake)
- Load data from Cloud Storage (Data Lake) to BigQuery (Data Warehouse)
- Transform data on BigQuery
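The four steps above run in a fixed order inside the DAG. As a minimal, stdlib-only sketch of that dependency chain (the task names here are illustrative, not the repo's actual Airflow task IDs), the ordering the DAG enforces looks like this:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Illustrative task names; the real DAG wires up Airflow operators
# (e.g. Postgres-to-GCS and GCS-to-BigQuery transfer operators).
# Each key maps a task to the tasks it depends on.
dag = {
    "extract_postgres_to_gcs": {"restore_postgres_backup"},
    "load_gcs_to_bigquery": {"extract_postgres_to_gcs"},
    "transform_in_bigquery": {"load_gcs_to_bigquery"},
}

# Topological order: restore runs first, the BigQuery transform last.
order = list(TopologicalSorter(dag).static_order())
print(order)
```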
Resources created:
- VPC with firewall rules
- Cloud Composer v2
- Cloud SQL for Postgres
- Cloud Storage Buckets
- BigQuery datasets and tables
See the Airflow Google Operators documentation for more available operators.
- Create a new project and select it
- Open Cloud Shell and ensure the env var below is set; otherwise set it with the gcloud config set project command
echo $GOOGLE_CLOUD_PROJECT
- Create a bucket to store your project's Terraform state
gsutil mb gs://$GOOGLE_CLOUD_PROJECT-tf-state
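Terraform references this bucket as its remote state backend. A minimal sketch of the backend block (the actual bucket name and prefix in the repo's Terraform config may differ):

```hcl
terraform {
  backend "gcs" {
    bucket = "YOUR_PROJECT-tf-state"
  }
}
```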
- Enable the necessary APIs
gcloud services enable compute.googleapis.com \
container.googleapis.com \
containerregistry.googleapis.com \
composer.googleapis.com \
bigquery.googleapis.com \
storage.googleapis.com \
cloudfunctions.googleapis.com \
pubsub.googleapis.com \
sqladmin.googleapis.com
- Grant Cloud Build the permissions it needs to create the resources
PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_PROJECT --format='value(projectNumber)')
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT --member=serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com --role=roles/editor
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT --member=serviceAccount:$PROJECT_NUMBER@cloudbuild.gserviceaccount.com --role=roles/iam.securityAdmin
- Clone this repo
git clone https://github.com/sylvioneto/gcp-cloud-composer.git
cd gcp-cloud-composer
- Execute Terraform using Cloud Build
gcloud builds submit . --config cloudbuild.yaml
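If you want to see what the build does before running it, a cloudbuild.yaml that drives Terraform typically chains the standard Terraform container image through init, plan, and apply. This is only a sketch; the repo's actual step names, image tag, and arguments may differ:

```yaml
steps:
  - id: 'tf init'
    name: 'hashicorp/terraform'
    args: ['init']
  - id: 'tf plan'
    name: 'hashicorp/terraform'
    args: ['plan']
  - id: 'tf apply'
    name: 'hashicorp/terraform'
    args: ['apply', '-auto-approve']
```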
- Go to Cloud Composer and check out the DAGs
- To clean up, destroy the resources with Cloud Build
gcloud builds submit . --config cloudbuild_destroy.yaml
Create a virtual environment, activate it, and install the requirements
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt