Goahnary / syrinx-demo

Local airflow deployment of an ETL process uploading encrypted data to GCS and BQ.

How to run

This code was developed against a local deployment of Airflow because issues with GCP blocked me from continuing development there.

To run it locally, you will need to install the following packages with pip:

  • apache-airflow-providers-google
  • cryptography
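The cryptography package is presumably what handles the encryption mentioned in the project description. A minimal sketch of how data could be encrypted with Fernet before upload (the key handling and file names here are assumptions, not taken from this repo):

```python
from cryptography.fernet import Fernet

# Generate (or load) a symmetric key; in a real pipeline this would be
# stored securely, e.g. in an Airflow Variable or a secret manager.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt the raw user data before it leaves the local machine.
with open("users.csv", "rb") as f:
    encrypted = fernet.encrypt(f.read())

with open("users.csv.enc", "wb") as f:
    f.write(encrypted)
```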

After you've installed the relevant packages, you will need to create a service account with access to GCS and BigQuery.

These are the roles I have for the service account I created in my GCP account:

  • BigQuery Data Editor
  • BigQuery Job User
  • Storage Object Admin
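Once the service account exists, download a JSON key for it and point the Google provider at it. One simple way is Application Default Credentials, which Airflow's Google operators fall back to when no explicit connection is configured (the key path below is just a placeholder):

```python
import os

# Point Google client libraries (and Airflow's Google provider) at the
# downloaded service-account key; replace the path with your own.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"
```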

Then you will want to create the following resources:

  • a GCS bucket named: user-data-etl-demo
  • a BQ dataset named: syrinx
  • a table in the dataset named: users
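With the bucket, dataset, and table in place, the load step of the pipeline roughly follows the pattern below. This is only a sketch of how the Google provider operators are typically wired together, not the exact DAG in this repo; the local file path, object name, and task ids are assumptions:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.local_to_gcs import LocalFilesystemToGCSOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="user_data_etl_demo",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Stage the (encrypted) file in the GCS bucket created above.
    upload_to_gcs = LocalFilesystemToGCSOperator(
        task_id="upload_to_gcs",
        src="/tmp/users.csv",          # placeholder local path
        dst="users/users.csv",         # object name inside the bucket
        bucket="user-data-etl-demo",
    )

    # Load the staged object into the syrinx.users table.
    load_to_bq = GCSToBigQueryOperator(
        task_id="load_to_bq",
        bucket="user-data-etl-demo",
        source_objects=["users/users.csv"],
        destination_project_dataset_table="syrinx.users",
        source_format="CSV",
        write_disposition="WRITE_TRUNCATE",
        autodetect=True,
    )

    upload_to_gcs >> load_to_bq
```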

If you cannot get it running, or would rather I just demo the code for you, I am happy to do so over a video call!

Please let me know if you have any questions. My email is noah@noahgary.io 🙂
