mallzee / redshift

Redshift package for dbt (getdbt.com)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Redshift data models and utilities

dbt models for Redshift warehouses.

Models

This package provides a number of base models for Redshift system tables, as well as a few utility views that usefully combine the base models.

Base Models

Each of these base models maps 1-to-1 with the underlying Redshift table. Some renaming has been performed to make the field names grokable.

Ephemeral Models

These ephemeral models simplify some of Redshift's field naming and logic, to make the data more usable.

  • redshift_cost: transforms the start and max explain cost values from stl_explain into floating point values

View Models

These views are designed to make debugging your Redshift cluster more straightforward. They are, in effect, materializations of the Diagnostic Queries for Query Tuning from Redshift's documentation.

  • redshift_admin_queries: Simplified view of queries, including explain cost, execution times, and queue times.
  • redshift_admin_table_stats: Gives insight on tables in your warehouse. Includes information on sort and dist keys, table size on disk, and more.
  • redshift_admin_dependencies: Simplified view of pg_depend, showing any dependent objects (views) for a given source object

These views are designed to make user privilege management more straightforward.

Introspection Models

These models (default ephemeral) make it possible to inspect tables, columns, constraints, and sort/dist keys of the Redshift cluster. These models are used to build column compression queries, but may also be generally useful.

Macros

compress_table (source)

This macro returns the SQL required to auto-compress a table using the results of an analyze compression query. All comments, constraints, keys, and indexes are copied to the newly compressed table by this macro. Additionally, sort and dist keys can be provided to override the settings from the source table. By default, a backup table is made which is not deleted. To delete this backup table after a successful copy, use drop_backup flag.

Macro signature:

{{ compress_table(schema, table,
                  drop_backup=False,
                  comprows=none|Integer,
                  sort_style=none|compound|interleaved,
                  sort_keys=none|List<String>,
                  dist_style=none|all|even,
                  dist_key=none|String) }}

Example usage:

{{
  config({
    "materialized":"table",
    "sort": "id",
    "dist": "id",
    "post-hook": [
      "{{ redshift.compress_table(this.schema, this.table, drop_backup=False) }}"
    ]
  })
}}

unload_table (source)

This macro returns the SQL required to unload a Redshift table to one or more files on S3. The macro replicates all functionality provided by Redshift's UNLOAD command.

Macro signature:

{{ unload_table(schema,
                table,
                s3_path,
                iam_role=None|String,
                aws_key=None|String,
                aws_secret=None|String,
                manifest=Boolean,
                delimiter=String,
                null_as=String,
                max_file_size=String,
                escape=Boolean,
                compression=None|GZIP|BZIP2,
                add_quotes=Boolean,
                encrypted=Boolean,
                overwrite=Boolean,
                parallel=Boolean) }}

Example usage:

{{
  config({
    "materialized":"table",
    "sort": "id",
    "dist": "id",
    "post-hook": [
      "{{ redshift.unload_table(this.schema,
                                this.table,
                                s3_path='s3://bucket/folder',
                                aws_key='abcdef',
                                aws_secret='ghijklm',
                                delimiter='|') }}"
    ]
  })
}}

About

Redshift package for dbt (getdbt.com)

License:Apache License 2.0


Languages

Language:TSQL 82.5%Language:PLpgSQL 17.2%Language:Makefile 0.3%