bigquery-to-pubsub

A tool for streaming time series data from a BigQuery table to a Pub/Sub topic

Home Page: https://medium.com/google-cloud/how-to-replay-time-series-data-from-google-bigquery-to-pub-sub-c0a80095124b


  1. Create a Service Account with the following roles:

    • BigQuery Admin
    • Storage Admin
    • Pub/Sub Publisher
  2. Create a key file for the Service Account and download it as credentials_file.json.

  3. Create a Pub/Sub topic called bigquery-to-pubsub-test0.

  4. Create a temporary GCS bucket and a temporary BigQuery dataset:

> bash create_temp_resources.sh
  5. Run replay for Ethereum transactions:
> docker build -t bigquery-to-pubsub:latest -f Dockerfile .
> project=$(gcloud config get-value project 2> /dev/null)
> temp_resource_name=$(./get_temp_resource_name.sh)
> echo "Replaying Ethereum transactions"
> docker run \
    -v /path_to_credentials_file/:/bigquery-to-pubsub/ --env GOOGLE_APPLICATION_CREDENTIALS=/bigquery-to-pubsub/credentials_file.json \
    bigquery-to-pubsub:latest \
    --bigquery-table bigquery-public-data.crypto_ethereum.transactions \
    --timestamp-field block_timestamp \
    --start-timestamp 2019-10-23T00:00:00 \
    --end-timestamp 2019-10-23T01:00:00 \
    --batch-size-in-seconds 1800 \
    --replay-rate 0.1 \
    --pubsub-topic projects/${project}/topics/bigquery-to-pubsub-test0 \
    --temp-bigquery-dataset ${temp_resource_name} \
    --temp-bucket ${temp_resource_name}
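The flags in the command above suggest how a replay proceeds: the `[start, end)` time range is split into windows of `--batch-size-in-seconds`, and each batch is published at a speed set by `--replay-rate`. The sketch below is a minimal illustration of that idea, assuming a replay rate of 0.1 means ten times faster than real time; the function names are hypothetical and are not the tool's actual API.

```python
from datetime import datetime, timedelta

def batch_windows(start, end, batch_size_in_seconds):
    """Split [start, end) into half-open windows of at most
    batch_size_in_seconds each (illustrative sketch only)."""
    step = timedelta(seconds=batch_size_in_seconds)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end

def wall_clock_seconds(batch_size_in_seconds, replay_rate):
    """Assumed pacing: a replay rate of 0.1 compresses a 1800-second
    batch into 180 seconds of wall-clock publishing time."""
    return batch_size_in_seconds * replay_rate

# The one-hour range from the command above, batched into 30-minute windows
start = datetime(2019, 10, 23, 0, 0, 0)
end = datetime(2019, 10, 23, 1, 0, 0)
windows = list(batch_windows(start, end, 1800))
```

With these parameters the one-hour range yields two 30-minute batches, each published over roughly 180 seconds of wall-clock time.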


License: MIT License


Languages

Python 85.1%, Shell 10.5%, Smarty 3.2%, Dockerfile 1.1%