dataflow-with-tasks

Setup

brew update
brew install pyenv pyenv-virtualenv
pyenv install 3.8.11
pyenv virtualenv 3.8.11 dataflow
pyenv local dataflow

Setup for project

apache-beam

pip install --upgrade virtualenv
python -m virtualenv env
source env/bin/activate
pip install --quiet 'apache-beam[gcp]'
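
Before installing further dependencies, it can help to confirm that the shell is using the intended interpreter and an active virtual environment. The following is a minimal sketch using only the standard library; the helper names in_virtualenv and matches_version are hypothetical and not part of this repository, and the 3.8 default simply mirrors the version pinned in the pyenv step above.

```python
import sys


def in_virtualenv() -> bool:
  """Return True when the interpreter runs inside a virtual environment."""
  # Modern virtualenv/venv make sys.prefix differ from sys.base_prefix;
  # very old virtualenv releases set sys.real_prefix instead.
  base = getattr(sys, "base_prefix", sys.prefix)
  return hasattr(sys, "real_prefix") or base != sys.prefix


def matches_version(expected=(3, 8), actual=None) -> bool:
  """Compare the (major, minor) pair of an interpreter against `expected`."""
  # `actual` defaults to the running interpreter; it is a parameter only so
  # the check can be exercised without switching Python versions.
  actual = sys.version_info if actual is None else actual
  return tuple(actual[:2]) == tuple(expected)


if __name__ == "__main__":
  print("python:", sys.version.split()[0])
  print("matches 3.8:", matches_version())
  print("virtualenv active:", in_virtualenv())
```

Saved to a scratch file and run after source env/bin/activate, this prints the interpreter version and whether a virtual environment is active.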

GCP

terraform init
terraform apply

Run the Dataflow enqueue task

GOOGLE_APPLICATION_CREDENTIALS=[CREDENTIALS_FILE] \
python -m example.cloud_tasks \
  --project [PROJECT_ID] \
  --runner DataflowRunner \
  --temp_location gs://[BUCKET_NAME]/temp \
  --output [QUEUE_NAME] \
  --job_name dataflow-intro \
  --region asia-northeast1 \
  --service_account_email [SERVICE_ACCOUNT_EMAIL] \
  --subnetwork=https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/regions/asia-northeast1/subnetworks/[NETWORK_NAME] \
  --requirements_file ./requirements.txt
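
The command above mixes one job-specific flag (--output, the queue name) with standard runner options. The source of example.cloud_tasks is not shown here, so the following is only a sketch of how such a module might separate the two, assuming it follows the common Beam pattern of argparse's parse_known_args. The helper names split_args and subnetwork_url are hypothetical, and the sample values (my-project, my-queue, my-subnet) are placeholders.

```python
import argparse


def subnetwork_url(project_id: str, region: str, subnetwork: str) -> str:
  """Build a subnetwork URL in the same form as the --subnetwork value above."""
  return (
      "https://www.googleapis.com/compute/v1/"
      f"projects/{project_id}/regions/{region}/subnetworks/{subnetwork}"
  )


def split_args(argv):
  """Separate the job's own --output flag from the remaining runner options."""
  parser = argparse.ArgumentParser()
  parser.add_argument("--output", required=True, help="Cloud Tasks queue name")
  # parse_known_args returns the parsed flags plus the untouched remainder.
  # A Beam job would typically pass that remainder to PipelineOptions
  # (assumption: this repository's module may do it differently).
  known, pipeline_args = parser.parse_known_args(argv)
  return known, pipeline_args


if __name__ == "__main__":
  known, pipeline_args = split_args([
      "--project", "my-project",
      "--runner", "DataflowRunner",
      "--output", "my-queue",
      "--region", "asia-northeast1",
      "--subnetwork=" + subnetwork_url(
          "my-project", "asia-northeast1", "my-subnet"),
  ])
  print("queue:", known.output)
  print("runner options:", pipeline_args)
```

Keeping the job-specific parser small means new runner options can be passed on the command line without changing the module.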

Code format

pip install autopep8
autopep8 --in-place --aggressive --aggressive --ignore=E111,E114 --indent-size=2 --recursive example

About

License: MIT License


Languages

HCL 47.9%, Python 31.0%, TypeScript 17.7%, Dockerfile 3.4%