dpguthrie / dbt-multi-tenant-example

Schema-Based Multitenancy

This project showcases data modeling options for the case where each customer has their own schema within a database.

Update (9/29/22) - Another option was added that the Loom video below and the code in this repo do not cover.

Loom

Watch a ~15 minute video detailing the options below: https://www.loom.com/share/0461b3473e34495586fbeed54671425e

Requirements

The dbt_utils package is used in this project. Specifically:

Options

All of the options share a couple of requirements:

  • union_source macro will be used to create staging models that contain data for all customers
  • customer_model macro will be used to select from the "master" table (contains all customer data) and filter down to the appropriate customer.
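
The union-then-filter pattern behind these two macros can be sketched outside of dbt. The Python below is a minimal illustration only, not the repo's macro code: the function bodies, the `customer_schema` column, and the `cust_1`, `orders`, and `stg_orders` names are all assumptions made for the example.

```python
def union_sources(schemas, table):
    """Build one SELECT that stacks `table` from every customer schema.

    A `customer_schema` column (hypothetical name) tags each row with the
    schema it came from, so rows can be filtered back out later.
    """
    selects = [
        f"select '{schema}' as customer_schema, * from {schema}.{table}"
        for schema in schemas
    ]
    return "\nunion all\n".join(selects)


def customer_model(master_table, schema):
    """Select a single customer's rows from the combined ("master") table."""
    return f"select * from {master_table} where customer_schema = '{schema}'"


# Hypothetical schema and table names, for illustration only.
print(union_sources(["cust_1", "cust_2"], "orders"))
print(customer_model("stg_orders", "cust_1"))
```

In the actual project this work is done by Jinja macros at compile time; the sketch only shows the shape of the SQL each step would produce.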

Option 1 (Explicit But Hard to Maintain)

This option requires that you explicitly create the models you'd like to replicate for each customer. Some basic math:

  • 5 dim/fct models to replicate for each customer
  • 100 customers
  • 500 models you'll need to create
  • Each model will have to be prefixed with the customer (cust_1_dim_customers.sql, cust_2_dim_customers.sql, ..., cust_n_dim_customers.sql)
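
To make the arithmetic concrete, here is a small Python sketch that lists the files Option 1 would require. Only `dim_customers` and the `cust_n_` prefix come from the text above; the other four model names are made up for the example.

```python
def customer_model_files(customers, models):
    """Return one model file name per (customer, model) pair."""
    return [f"{customer}_{model}.sql" for customer in customers for model in models]


# 100 customers, named cust_1 .. cust_100.
customers = [f"cust_{i}" for i in range(1, 101)]

# 5 dim/fct models; every name except dim_customers is hypothetical.
models = ["dim_customers", "dim_products", "fct_orders", "fct_payments", "fct_refunds"]

files = customer_model_files(customers, models)
print(len(files))  # 500
print(files[0])    # cust_1_dim_customers.sql
```

Every new customer adds five more files, and every new shared model adds one file per customer.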

The benefit here is that we have explicitly defined our models and they'll be included in our lineage graph. The downside of this approach is that it is not scalable and becomes very hard to maintain as more customers are added.

Option 2 (Programmatic But Not Explicit)

This option creates those same customer-specific fct/dim models through a macro. The macro loops through each customer schema and a defined set of models, and creates the tables by issuing DDL directly. We lose the lineage here, because tables created this way are not dbt models, but the approach scales much better as both the number of customers and the number of shared models grow.
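
The nested loop described above can be sketched in Python as a generator of DDL statements. This is an illustration under assumptions, not the repo's macro: the `analytics` schema, the `customer_schema` filter column, and the model names are hypothetical, and `create or replace table` is warehouse-dependent syntax.

```python
def customer_table_ddl(schemas, models, master_schema="analytics"):
    """Return one CREATE TABLE statement per (customer schema, model) pair.

    Each statement copies a single customer's rows out of the combined
    table in `master_schema` (hypothetical name) into that customer's schema.
    """
    statements = []
    for schema in schemas:
        for model in models:
            statements.append(
                f"create or replace table {schema}.{model} as "
                f"select * from {master_schema}.{model} "
                f"where customer_schema = '{schema}'"
            )
    return statements


# Hypothetical inputs: 2 customer schemas x 2 shared models = 4 statements.
for statement in customer_table_ddl(["cust_1", "cust_2"], ["dim_customers", "fct_orders"]):
    print(statement)
```

Adding a customer or a shared model only lengthens one of the input lists; no new model files are needed, which is where the scalability comes from.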

The job for this option should be defined with at least the following commands (and in this order):

  1. dbt run
  2. dbt run-operation create_customer_tables

The first command will create the tables containing all customer data and the second command will run the macro to create customer-specific models.
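
The ordering matters because the macro reads from the tables that `dbt run` builds. A minimal Python sketch of a job wrapper that enforces that order and stops on the first failure might look like the following; the `run_job` helper and its `runner` parameter are hypothetical, and only the two dbt commands come from the steps above.

```python
import subprocess

# The two commands from the job definition above, in the required order.
JOB_COMMANDS = [
    ["dbt", "run"],
    ["dbt", "run-operation", "create_customer_tables"],
]


def run_job(commands=JOB_COMMANDS, runner=None):
    """Run each command in order, stopping at the first non-zero exit code.

    `runner` takes a command (list of strings) and returns its exit code.
    By default it shells out with subprocess; pass a stub for a dry run.
    Returns the list of commands that completed successfully.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    completed = []
    for cmd in commands:
        if runner(cmd) != 0:
            break  # do not build customer tables on top of a failed run
        completed.append(cmd)
    return completed


# Dry run with a stub runner, so nothing is executed against a warehouse.
print(run_job(runner=lambda cmd: 0))
```

Most schedulers already stop a job at the first failing step, so a wrapper like this is only needed when the commands are run from a plain script.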

Option 3 (Best of Both Worlds)

This option would leverage a package called dbt-dynamic-models.