terraform-google-bigquery

Terraform module to create Google Cloud Platform BigQuery datasets and tables. It lets you programmatically create empty tables, each with its schema, inside a dataset so that they are ready for loading. Additional user accounts and permissions are necessary before anyone can query the newly created table(s).

Resources

Example Usage

// The BigQuery API must be enabled in the project
resource "google_project_service" "bigquery_service" {
  project = local.project_id
  service = "bigquery.googleapis.com"
}

module "bigquery" {
  source  = "app.terraform.io/Seagen/bigquery/google"
  version = "5.2.0"

  dataset_id   = google_bigquery_dataset.dataset.dataset_id
  dataset_name = "nyt-covid-dataset"
  description  = google_bigquery_dataset.dataset.description
  project_id   = local.project_id
  location     = google_bigquery_dataset.dataset.location

  tables = [
    {
      table_id           = google_bigquery_table.table.table_id,
      schema             = file("bigquery/nyt_covid_count_by_state_schema.json"),
      time_partitioning  = null,
      range_partitioning = null,
      expiration_time    = null,
      clustering         = [],
      labels             = local.labels,
    }
  ]
  dataset_labels = local.labels
}

// Create a BigQuery dataset resource
resource "google_bigquery_dataset" "dataset" {
  dataset_id  = "nyt_covid_dataset"
  description = "New York Times Covid Dataset"
  location    = "US"
}

// Create a BigQuery table resource
resource "google_bigquery_table" "table" {
  deletion_protection = false
  dataset_id          = google_bigquery_dataset.dataset.dataset_id
  table_id            = "nyt_covid_count_by_state"
}

The example above reads the table schema from a file, so create a folder called 'bigquery' within your repo. Inside this folder add the schema file referenced by the module call (nyt_covid_count_by_state_schema.json in this example), such as the one below:

[
    {
      "description": "Date",
      "mode": "NULLABLE",
      "name": "date",
      "type": "DATE"
    },
    {
      "description": "Name of State",
      "mode": "NULLABLE",
      "name": "state_name",
      "type": "STRING"
    },
    {
      "description": "State Identifier",
      "mode": "NULLABLE",
      "name": "state_fips_code",
      "type": "INTEGER"
    },
    {
      "description": "Confirmed Number of Cases",
      "mode": "NULLABLE",
      "name": "confirmed_cases",
      "type": "INTEGER"
    },
    {
      "description": "Number of Deaths",
      "mode": "NULLABLE",
      "name": "deaths",
      "type": "INTEGER"
    }
]
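
Because the schema above contains a date column of type DATE, the table entry could also be partitioned and clustered instead of leaving those fields empty. The fragment below is a minimal sketch of such an entry, meant to replace the tables argument inside the module block from the example; the choice of daily partitioning on date and clustering on state_name is an assumption made for illustration, not something this module requires.

```hcl
# Sketch only: goes inside the module "bigquery" block in place of the
# original tables argument. Partitioning and clustering choices are
# illustrative assumptions.
tables = [
  {
    table_id = "nyt_covid_count_by_state"
    schema   = file("bigquery/nyt_covid_count_by_state_schema.json")

    # Partition by the DATE column defined in the schema file.
    time_partitioning = {
      type                     = "DAY"
      field                    = "date"
      require_partition_filter = false
      expiration_ms            = null
    }

    range_partitioning = null
    expiration_time    = null

    # Cluster rows by a column that also exists in the schema file.
    clustering = ["state_name"]
    labels     = local.labels
  }
]
```

Every attribute of the object still has to be present, since the tables input is declared as a list of objects with a fixed set of fields; unused ones are set to null or an empty list as in the original example.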

Features

This module provisions a dataset and a list of tables with associated JSON schemas and views from queries.
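
As an illustration of the "views from queries" part, the fragment below sketches a views argument that could be added to the module block from the example. The view name cases_by_state and the query itself are hypothetical; the attribute names follow the views input type listed under Inputs.

```hcl
# Sketch only: goes inside the module "bigquery" block.
# The view_id and the query text are hypothetical examples.
views = [
  {
    view_id        = "cases_by_state"
    use_legacy_sql = false

    # Standard SQL query over the table created in the example.
    query = <<-EOT
      SELECT date, state_name, confirmed_cases, deaths
      FROM `${local.project_id}.nyt_covid_dataset.nyt_covid_count_by_state`
    EOT

    labels = local.labels
  }
]
```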

Inputs

Name	Description	Type	Default	Required
access	An array of objects that define dataset access for one or more entities.	`any`	[ { "role": "roles/bigquery.dataOwner", "special_group": "projectOwners" } ]	no
dataset_id	Unique ID for the dataset being provisioned.	`string`	n/a	yes
dataset_labels	Key value pairs in a map for dataset labels	`map(string)`	`{}`	no
dataset_name	Friendly name for the dataset being provisioned.	`string`	`null`	no
default_table_expiration_ms	TTL of tables using the dataset, in milliseconds.	`number`	`null`	no
delete_contents_on_destroy	(Optional) If set to true, delete all the tables in the dataset when destroying the resource; otherwise, destroying the resource will fail if tables are present.	`bool`	`null`	no
deletion_protection	Whether or not to allow Terraform to destroy the table. Unless this field is set to false in Terraform state, a terraform destroy or terraform apply that would delete the table will fail.	`bool`	`false`	no
description	Dataset description.	`string`	`null`	no
encryption_key	Default encryption key to apply to the dataset. Defaults to null (Google-managed).	`string`	`null`	no
external_tables	A list of objects which include table_id, expiration_time, external_data_configuration, and labels.	list(object({ table_id = string, autodetect = bool, compression = string, ignore_unknown_values = bool, max_bad_records = number, schema = string, source_format = string, source_uris = list(string), csv_options = object({ quote = string, allow_jagged_rows = bool, allow_quoted_newlines = bool, encoding = string, field_delimiter = string, skip_leading_rows = number, }), google_sheets_options = object({ range = string, skip_leading_rows = number, }), hive_partitioning_options = object({ mode = string, source_uri_prefix = string, }), expiration_time = string, labels = map(string), }))	`[]`	no
location	The regional location for the dataset. Only US and EU are allowed in this module.	`string`	`"US"`	no
project_id	Project where the dataset and table are created	`string`	n/a	yes
routines	A list of objects which include routine_id, routine_type, language, definition_body, return_type, description and arguments.	list(object({ routine_id = string, routine_type = string, language = string, definition_body = string, return_type = string, description = string, arguments = list(object({ name = string, data_type = string, argument_kind = string, mode = string, })), }))	`[]`	no
tables	A list of objects which include table_id, schema, clustering, time_partitioning, range_partitioning, expiration_time and labels.	list(object({ table_id = string, schema = string, clustering = list(string), time_partitioning = object({ expiration_ms = string, field = string, type = string, require_partition_filter = bool, }), range_partitioning = object({ field = string, range = object({ start = string, end = string, interval = string, }), }), expiration_time = string, labels = map(string), }))	`[]`	no
views	A list of objects which include view_id, query, use_legacy_sql and labels.	list(object({ view_id = string, query = string, use_legacy_sql = bool, labels = map(string), }))	`[]`	no
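
Since querying the new tables requires additional permissions, the access input is one place to grant them. The fragment below is a hedged sketch: it keeps the default owner entry from the table above and adds a read-only user. The email address is a placeholder, and the user_by_email key is assumed to be passed through to the access block of the underlying google_bigquery_dataset resource.

```hcl
# Sketch only: goes inside the module "bigquery" block.
# Setting access replaces the default value, so the default owner entry
# is repeated here.
access = [
  {
    role          = "roles/bigquery.dataOwner"
    special_group = "projectOwners"
  },
  {
    # Placeholder address; user_by_email is assumed to be supported by
    # the module in the same way as on google_bigquery_dataset.
    role          = "roles/bigquery.dataViewer"
    user_by_email = "analyst@example.com"
  }
]
```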

Outputs

Name	Description
bigquery_dataset	BigQuery dataset resource.
bigquery_external_tables	Map of BigQuery external table resources being provisioned.
bigquery_tables	Map of BigQuery table resources being provisioned.
bigquery_views	Map of BigQuery view resources being provisioned.
external_table_ids	Unique IDs for any external tables being provisioned
external_table_names	Friendly names for any external tables being provisioned
project	Project where the dataset and tables are created
routine_ids	Unique IDs for any routine being provisioned
table_ids	Unique ID for the table being provisioned
table_names	Friendly name for the table being provisioned
view_ids	Unique ID for the view being provisioned
view_names	Friendly name for the view being provisioned
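
These outputs can be referenced from the calling configuration through the module name. A minimal sketch, assuming the module block is named bigquery as in the example usage (the output names covid_table_ids and covid_dataset are hypothetical):

```hcl
# Re-export the IDs of the tables created by the module.
output "covid_table_ids" {
  description = "IDs of the tables created by the bigquery module"
  value       = module.bigquery.table_ids
}

# Expose the dataset resource created by the module.
output "covid_dataset" {
  description = "Dataset resource created by the bigquery module"
  value       = module.bigquery.bigquery_dataset
}
```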

About

This module allows you to create opinionated Google Cloud Platform BigQuery datasets and tables.

https://registry.terraform.io/modules/terraform-google-modules/bigquery/google

Apache License 2.0
