Seagen / terraform-google-bigquery

This module allows you to create opinionated Google Cloud Platform BigQuery datasets and tables.

Home Page: https://registry.terraform.io/modules/terraform-google-modules/bigquery/google

terraform-google-bigquery

Terraform Module to create Google Cloud Platform BigQuery datasets and tables. This will allow the user to programmatically create an empty table schema inside of a dataset, ready for loading. Additional user accounts and permissions are necessary to begin querying the newly created table(s).

Resources

Example Usage

// The BigQuery API must be enabled on the project
resource "google_project_service" "bigquery_service" {
  project = local.project_id
  service = "bigquery.googleapis.com"
}

module "bigquery" {
  source  = "app.terraform.io/Seagen/bigquery/google"
  version = "5.2.0"

  dataset_id   = google_bigquery_dataset.dataset.dataset_id
  dataset_name = "nyt-covid-dataset"
  description  = google_bigquery_dataset.dataset.description
  project_id   = local.project_id
  location     = google_bigquery_dataset.dataset.location

  tables = [
    {
      table_id           = google_bigquery_table.table.table_id,
      schema             = file("bigquery/nyt_covid_count_by_state_schema.json"),
      time_partitioning  = null,
      range_partitioning = null,
      expiration_time    = null,
      clustering         = [],
      labels             = local.labels,
    }
  ]
  dataset_labels = local.labels
}

// Create a BigQuery dataset resource
resource "google_bigquery_dataset" "dataset" {
  dataset_id  = "nyt_covid_dataset"
  description = "New York Times Covid Dataset"
  location    = "US"
}

// Create a BigQuery table resource
resource "google_bigquery_table" "table" {
  deletion_protection = false
  dataset_id          = google_bigquery_dataset.dataset.dataset_id
  table_id            = "nyt_covid_count_by_state"
}

The example above reads the table schema from a folder named 'bigquery' in your repo. Create that folder and add a JSON schema file to it (nyt_covid_count_by_state_schema.json in the example above), such as the one below:

[
    {
      "description": "Date",
      "mode": "NULLABLE",
      "name": "date",
      "type": "DATE"
    },
    {
      "description": "Name of State",
      "mode": "NULLABLE",
      "name": "state_name",
      "type": "STRING"
    },
    {
      "description": "State Identifier",
      "mode": "NULLABLE",
      "name": "state_fips_code",
      "type": "INTEGER"
    },
    {
      "description": "Confirmed Number of Cases",
      "mode": "NULLABLE",
      "name": "confirmed_cases",
      "type": "INTEGER"
    },
    {
      "description": "Number of Deaths",
      "mode": "NULLABLE",
      "name": "deaths",
      "type": "INTEGER"
    }
]
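
Because the schema input is a JSON string, the same definition can also be written inline with Terraform's built-in jsonencode function instead of a separate file. The following is a minimal sketch, not part of the original module documentation; the local name covid_schema is hypothetical, and only the first two columns of the file above are shown.

```hcl
locals {
  # Hypothetical local value; mirrors the first two columns of the JSON file above.
  covid_schema = jsonencode([
    {
      name        = "date"
      type        = "DATE"
      mode        = "NULLABLE"
      description = "Date"
    },
    {
      name        = "state_name"
      type        = "STRING"
      mode        = "NULLABLE"
      description = "Name of State"
    },
  ])
}
```

With this in place, a table entry could set schema = local.covid_schema rather than calling file().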

Features

This module provisions a dataset and a list of tables with associated JSON schemas and views from queries.
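
As an illustration of views built from queries, the sketch below adds a view alongside the table from the example usage. It assumes the views input accepts the object shape listed under Inputs (view_id, query, use_legacy_sql, labels); the module name bigquery_with_view, the view name deaths_by_state and the query itself are hypothetical.

```hcl
module "bigquery_with_view" {
  source  = "app.terraform.io/Seagen/bigquery/google"
  version = "5.2.0"

  dataset_id = "nyt_covid_dataset"
  project_id = local.project_id

  tables = [
    {
      table_id           = "nyt_covid_count_by_state"
      schema             = file("bigquery/nyt_covid_count_by_state_schema.json")
      time_partitioning  = null
      range_partitioning = null
      expiration_time    = null
      clustering         = []
      labels             = {}
    }
  ]

  views = [
    {
      view_id        = "deaths_by_state" # hypothetical view name
      use_legacy_sql = false
      labels         = {}
      # Standard SQL query over the table defined above.
      query = <<-EOT
        SELECT state_name, MAX(deaths) AS deaths
        FROM `${local.project_id}.nyt_covid_dataset.nyt_covid_count_by_state`
        GROUP BY state_name
      EOT
    }
  ]
}
```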

Inputs

Each input is listed by name, followed by its description, type, default and whether it is required.

access
  Description: An array of objects that define dataset access for one or more entities.
  Type: any
  Default:
    [
      {
        "role": "roles/bigquery.dataOwner",
        "special_group": "projectOwners"
      }
    ]
  Required: no

dataset_id
  Description: Unique ID for the dataset being provisioned.
  Type: string
  Default: n/a
  Required: yes

dataset_labels
  Description: Key value pairs in a map for dataset labels.
  Type: map(string)
  Default: {}
  Required: no

dataset_name
  Description: Friendly name for the dataset being provisioned.
  Type: string
  Default: null
  Required: no

default_table_expiration_ms
  Description: TTL of tables using the dataset, in milliseconds.
  Type: number
  Default: null
  Required: no

delete_contents_on_destroy
  Description: (Optional) If set to true, delete all the tables in the dataset when destroying the resource; otherwise, destroying the resource will fail if tables are present.
  Type: bool
  Default: null
  Required: no

deletion_protection
  Description: Whether or not to allow Terraform to destroy the instance. Unless this field is set to false in Terraform state, a terraform destroy or terraform apply that would delete the instance will fail.
  Type: bool
  Default: false
  Required: no

description
  Description: Dataset description.
  Type: string
  Default: null
  Required: no

encryption_key
  Description: Default encryption key to apply to the dataset. Defaults to null (Google-managed).
  Type: string
  Default: null
  Required: no

external_tables
  Description: A list of objects which include table_id, expiration_time, external_data_configuration, and labels.
  Type:
    list(object({
      table_id              = string,
      autodetect            = bool,
      compression           = string,
      ignore_unknown_values = bool,
      max_bad_records       = number,
      schema                = string,
      source_format         = string,
      source_uris           = list(string),
      csv_options = object({
        quote                 = string,
        allow_jagged_rows     = bool,
        allow_quoted_newlines = bool,
        encoding              = string,
        field_delimiter       = string,
        skip_leading_rows     = number,
      }),
      google_sheets_options = object({
        range             = string,
        skip_leading_rows = number,
      }),
      hive_partitioning_options = object({
        mode              = string,
        source_uri_prefix = string,
      }),
      expiration_time = string,
      labels          = map(string),
    }))
  Default: []
  Required: no

location
  Description: The regional location for the dataset. Only US and EU are allowed in this module.
  Type: string
  Default: "US"
  Required: no

project_id
  Description: Project where the dataset and table are created.
  Type: string
  Default: n/a
  Required: yes

routines
  Description: A list of objects which include routine_id, routine_type, language, definition_body, return_type, description and arguments.
  Type:
    list(object({
      routine_id      = string,
      routine_type    = string,
      language        = string,
      definition_body = string,
      return_type     = string,
      description     = string,
      arguments = list(object({
        name          = string,
        data_type     = string,
        argument_kind = string,
        mode          = string,
      })),
    }))
  Default: []
  Required: no

tables
  Description: A list of objects which include table_id, schema, clustering, time_partitioning, range_partitioning, expiration_time and labels.
  Type:
    list(object({
      table_id   = string,
      schema     = string,
      clustering = list(string),
      time_partitioning = object({
        expiration_ms            = string,
        field                    = string,
        type                     = string,
        require_partition_filter = bool,
      }),
      range_partitioning = object({
        field = string,
        range = object({
          start    = string,
          end      = string,
          interval = string,
        }),
      }),
      expiration_time = string,
      labels          = map(string),
    }))
  Default: []
  Required: no

views
  Description: A list of objects which include view_id, query, use_legacy_sql and labels.
  Type:
    list(object({
      view_id        = string,
      query          = string,
      use_legacy_sql = bool,
      labels         = map(string),
    }))
  Default: []
  Required: no
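
To show how several of these inputs combine, here is a hedged sketch of a day-partitioned, clustered table with an extra read-only access entry. It assumes the object shapes listed above and reuses the role/special_group keys from the access default; the module name bigquery_partitioned is hypothetical, and the exact behavior depends on the module version in use.

```hcl
module "bigquery_partitioned" {
  source  = "app.terraform.io/Seagen/bigquery/google"
  version = "5.2.0"

  dataset_id = "nyt_covid_dataset"
  project_id = local.project_id
  location   = "US"

  # Keep the documented default owner entry and add project readers as viewers.
  access = [
    {
      role          = "roles/bigquery.dataOwner"
      special_group = "projectOwners"
    },
    {
      role          = "roles/bigquery.dataViewer"
      special_group = "projectReaders"
    },
  ]

  tables = [
    {
      table_id = "nyt_covid_count_by_state"
      schema   = file("bigquery/nyt_covid_count_by_state_schema.json")

      # Partition by the DATE column from the schema; every attribute of the
      # object type must be set, so unused ones are null.
      time_partitioning = {
        type                     = "DAY"
        field                    = "date"
        require_partition_filter = false
        expiration_ms            = null
      }
      range_partitioning = null

      clustering      = ["state_name"]
      expiration_time = null
      labels          = {}
    }
  ]
}
```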

Outputs

Name: Description
bigquery_dataset: BigQuery dataset resource.
bigquery_external_tables: Map of BigQuery external table resources being provisioned.
bigquery_tables: Map of BigQuery table resources being provisioned.
bigquery_views: Map of BigQuery view resources being provisioned.
external_table_ids: Unique IDs for any external tables being provisioned.
external_table_names: Friendly names for any external tables being provisioned.
project: Project where the dataset and tables are created.
routine_ids: Unique IDs for any routine being provisioned.
table_ids: Unique ID for the table being provisioned.
table_names: Friendly name for the table being provisioned.
view_ids: Unique ID for the view being provisioned.
view_names: Friendly name for the view being provisioned.
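
These outputs can be re-exported from the calling configuration or passed to other resources. A short sketch, assuming the module block is named bigquery as in the example usage; the output names covid_table_ids and covid_dataset_id are hypothetical.

```hcl
output "covid_table_ids" {
  description = "IDs of the tables created by the bigquery module."
  value       = module.bigquery.table_ids
}

output "covid_dataset_id" {
  description = "ID of the dataset, read from the dataset resource output."
  value       = module.bigquery.bigquery_dataset.dataset_id
}
```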

License: Apache License 2.0

