ScalefreeCOM / datavault4dbt

Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including the Staging Area, DV2.0 main entities, PITs and Snapshot Tables.

Home Page: https://www.scalefree.com/

dbt deps creates an error in dbt cloud

universe-designer opened this issue

I tried to recreate the code from the demo session. I get the following error when I run "dbt deps":

Server log summary

Compilation Error in model stg_order (models/staging/datavaultdemo/stg_order.sql)
  The name argument to ref() must be a string, got <class 'dbt.clients.jinja.create_undefined.<locals>.Undefined'>
  
  > in macro snowflake__stage (macros/staging/snowflake/stage.sql)
  > called by macro stage (macros/staging/stage.sql)
  > called by model stg_order (models/staging/datavaultdemo/stg_order.sql)

Server log Details

16:02:29  Set downloads directory='/tmp/dbt-downloads-4ifolly8'
16:02:29  Making package index registry request: GET https://hub.getdbt.com/api/v1/index.json
16:02:29  Response from registry index: GET https://hub.getdbt.com/api/v1/index.json 200
16:02:29  Making package registry request: GET https://hub.getdbt.com/api/v1/dbt-labs/dbt_utils.json
16:02:29  Response from registry: GET https://hub.getdbt.com/api/v1/dbt-labs/dbt_utils.json 200
16:02:29  Making package registry request: GET https://hub.getdbt.com/api/v1/ScalefreeCOM/datavault4dbt.json
16:02:29  Response from registry: GET https://hub.getdbt.com/api/v1/ScalefreeCOM/datavault4dbt.json 200
16:02:29  Installing dbt-labs/dbt_utils
16:02:35    Installed from version 0.9.2
16:02:35    Up to date!
16:02:35  Sending event: {'category': 'dbt', 'action': 'package', 'label': '797fd050-690d-4e6b-a0b8-e889d588b98f', 'property_': 'install', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f865004caf0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f865004cc10>]}
16:02:35  Installing ScalefreeCOM/datavault4dbt
16:02:46    Installed from version 1.0.5
16:02:46    Up to date!
16:02:46  Sending event: {'category': 'dbt', 'action': 'package', 'label': '797fd050-690d-4e6b-a0b8-e889d588b98f', 'property_': 'install', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f865004c940>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f865004cac0>]}

My files are

packages.yml

packages:
  - package: dbt-labs/dbt_utils
    version: 0.9.2
  - package: ScalefreeCOM/datavault4dbt
    version: 1.0.5

models/staging/datavaultdemo/sources.yml


version: 2

sources: 
  - name: TPC-H_SF1
    database: SNOWFLAKE_SAMPLE_DATA
    schema: TPCH_SF1
    tables:
      - name: customer
  - name: deltas
    database: DATAVAULT4DBT_DEMO
    schema: CORE_SOURCE_SCHEMA
    tables: 
      - name: orders_initial
      - name: orders_1993
      - name: orders_1994
      - name: orders_1995
      - name: orders_1996
      - name: orders_1997
      - name: orders_1998

models/staging/datavaultdemo/stg_order.sql

    {{ config(materialized='view', schema='CORE_SOURCE_SCHEMA') }}

    {%- set yaml_metadata -%}
    source_model: 
          'deltas': 'orders_initial'
    hashed_columns: 
        hk_h_orders:
            - o_orderkey
        hk_h_customers:
            - customer_name
        hk_l_orders_customers:
            - o_orderkey
            - customer_name
        hd_orders_n_s:
            is_hashdiff: true
            columns:
                - o_orderstatus
                - o_totalprice
                - o_orderdate
                - o_orderpriority
                - o_clerk
                - o_shippriority
                - o_comment
                - is_highest_priority
                - description
                - legacy_orderkey
    derived_columns:
        is_highest_priority:
            value: "CASE WHEN (o_orderpriority = '1 - URGENT') THEN true ELSE false END"
            datatype: "BOOLEAN"
            src_cols_required: 'o_orderpriority'
        description:
            value: '!Orders from TPC_H, reference to Customers'
            datatype: 'STRING'
    missing_columns:
        legacy_orderkey: 'STRING'
    prejoined_columns:
        customer_name:
            src_name: 'TPC-H_SF1'
            src_table: 'customer'
            bk: 'c_name'
            this_column_name: 'o_custkey'
            ref_column_name: 'c_custkey'
    ldts: 'edwLoadDate'
    rsrc: '!TPC_H_SF1.Orders'
    {%- endset -%}

    {%- set metadata_dict = fromyaml(yaml_metadata) -%}

    {%- set source_model = metadata_dict['source_model'] -%}
    {%- set ldts = metadata_dict['ldts'] -%}
    {%- set rsrc = metadata_dict['rsrc'] -%}
    {%- set hashed_columns = metadata_dict['hashed_columns'] -%}
    {%- set derived_columns = metadata_dict['derived_columns'] -%}
    {%- set prejoined_columns = metadata_dict['prejoined_columns'] -%}
    {%- set missing_columns = metadata_dict['missing_columns'] -%}

    {{ datavault4dbt.stage(source_model=source_model,
                        ldts=ldts,
                        rsrc=rsrc,
                        hashed_columns=hashed_columns,
                        derived_columns=derived_columns,
                        prejoined_columns=prejoined_columns,
                        missing_columns=missing_columns) }}

Did I make a mistake, or is there an issue with dbt Cloud? I tried restarting the IDE multiple times.

Hi @universe-designer and thanks for reaching out to us!

I have two remarks on your staging model:

  • When I copy and paste your model into Visual Studio Code, all lines are indented too far. It might be a GitHub formatting issue, but please verify that the line containing "config", the "set" calls for the variable definitions, and the "datavault4dbt.stage" call all start at the very beginning of their lines, without any leading spaces.
  • Besides that, there seems to be one space too many before "deltas": "orders_initial". It should be at the same indentation as "hk_h_orders".

Besides those two findings, your model looks absolutely fine! Unfortunately, because YAML is indentation-sensitive, even the slightest indentation problem leads to a dbt error message.
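
For reference, here is a sketch of how the top of the model could look with both remarks applied. It assumes the extra leading spaces are really in the model file and not just an artifact of how GitHub rendered the post, and it only shows the first keys; everything else in the model stays as posted, just without the leading spaces.

```
{{ config(materialized='view', schema='CORE_SOURCE_SCHEMA') }}

{%- set yaml_metadata -%}
source_model:
    'deltas': 'orders_initial'
hashed_columns:
    hk_h_orders:
        - o_orderkey
# remaining keys (derived_columns, missing_columns, prejoined_columns, ldts, rsrc)
# follow unchanged, each top-level key starting in the first column
{%- endset -%}
```

The important part is that every top-level YAML key (source_model, hashed_columns, and so on) starts in the same column, and that nested keys such as 'deltas' and hk_h_orders share one indentation level.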

Let me know if this solves your issue!

Hello @tkirschke, thank you very much for your quick response! Indentation was the issue. It works now, thanks!