dbt-labs / dbt-event-logging

a dbt package to make auditing dbt runs easy.

Home Page:https://hub.getdbt.com/dbt-labs/logging/latest/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

dbt 0.12 and custom schemas

dapearce opened this issue · comments

This package depends quite a bit on the default schema naming format (eg {{ schema }}_meta). dbt 0.12 applies custom schema naming overrides to packages (dbt-labs/dbt-core#801), which causes this package to break.

The primary issue I believe is the line below, which defines the schema name of audit log table, overriding any custom schema naming that the other models will now use:
https://github.com/fishtown-analytics/dbt-event-logging/blob/master/macros/audit.sql#L5

cc @jthandy

FYI I reconfigured our schema naming to get around this but may be impacting others

oh christ. yeah, I actually did it this way originally because that feature didn't exist. I will try to get around to fixing this. Thanks for the heads up.

I had a look at this today as it's relevant to a piece of client work that I'm doing.

The tricky part about using the custom schema behavior is that we have to know which schema the audit table lives in.
Currently:

  • dbt inserts into the relation {{ logging.get_audit_relation() }} here, e.g. jaffle_shop_dev_meta.dbt_audit_log
  • dbt figures out what object that is by using schema=target.schema~'_meta' here, in our case jaffle_shop_dev_meta.

If we instead use: schema=schema, the dbt_audit_log table gets created in the schema that is responsive to the generate_schema_name macro.

But then, when we try to insert into this table from a model that is being created in another schema, it will use the schema for the current model, leading to errors like this:

...
10:24:16 | 4 of 5 START table model jaffle_shop_dev.customers................... [RUN]
10:24:16 | 4 of 5 ERROR creating table model jaffle_shop_dev.customers.......... [ERROR in 0.36s]
...
Completed with 2 errors:

Database Error in model customers (models/customers.sql)
  relation "jaffle_shop_dev.dbt_audit_log" does not exist
  compiled SQL at target/compiled/jaffle_shop/customers.sql

I can't find a way to easily tell dbt to always the schema for the dbt_audit_log table instead. I tried something like:

    {% set audit_schema =
        ref('stg_dbt_audit_log').schema
    %}

but that resulted in:

Compilation Error in model customers (models/customers.sql)
  dbt was unable to infer all dependencies for the model "customers".
  This typically happens when ref() is placed within a conditional block.

  To fix this, add the following hint to the top of the model "customers":

  -- depends_on: {{ ref('stg_dbt_audit_log') }}

  > in macro get_audit_relation (macros/audit.sql)
  > called by macro log_audit_event (macros/audit.sql)
  > called by macro log_model_start_event (macros/audit.sql)
  > called by macro statement (macros/core.sql)
  > called by macro run_hooks (macros/materializations/helpers.sql)
  > called by macro materialization_table_default (macros/materializations/table/table.sql)
  > called by model customers (models/customers.sql)

So, is there a way to access {{ logging.schema }} in some way without introducing a whole bunch of dependencies?

@clrcrl what is the desired behavior here? Should all logs go into the same table? Or should a _meta schema with an audit table be created for every custom schema in the project.

If we want everything to go into a single table, I think we might need to hack it like this:

{% set schema = graph.nodes['model.logging.stg_dbt_audit_log']['schema'] %}

This is a pretty big hack, but it's no worse than the existing hack :p

Closing as I'm not actually sure what the bug is here. I tried to dig in a tiny bit, but didn't get anywhere:

Steps to reproduce:

  1. Add event-logging package to project, including the model pre and post hooks
  2. Configure some models to be materialized in a separate schema
  3. dbt run

Expected results:

  1. All audit records end up in the same table, <target.schema>_meta.dbt_audit_log, regardless of schema they are configured in.
  2. The event_schema is correct for each model

Actual results:

  1. As above — no bug detected