calogica / dbt-expectations

Port(ish) of Great Expectations to dbt test macros

Home Page: https://calogica.github.io/dbt-expectations/

[BUG] expect_row_values_to_have_data_for_every_n_datepart can fail if model has never been run

verhey opened this issue

Is this a new bug in dbt-expectations?

  • I believe this is a new bug in dbt-expectations
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

The run_query operation in expect_row_values_to_have_data_for_every_n_datepart can fail if test_start_date or test_end_date is not specified and the model that dbt-expectations is testing hasn't been created yet in the target database.

In our case, this happens at the dbt compile step of a CI build that validates that the dbt project associated with a PR is able to compile.

While I don't expect tests to pass if a model doesn't exist in a DB, I do generally want compile to complete, and this blocks that.

Expected Behavior

dbt compile is able to complete when using expect_row_values_to_have_data_for_every_n_datepart even if dbt build/run hasn't been run on a model yet.

Steps To Reproduce

  1. Create a new model and don't run it.
# models/datamart/expectations_test.sql
{{config(
    schema='datamart',
    materialized='view',
) }}

select '2023-01-01'::date as foo
union all
select '2023-01-02'::date as foo
union all
select '2023-01-03'::date as foo
union all
select '2023-01-04'::date as foo
union all
select '2023-01-05'::date as foo
  2. Create an expect_row_values_to_have_data_for_every_n_datepart test against that new model.
# models/datamart/expectations_test.yml
version: 2

models:
  - name: expectations_test
    tests:
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: foo
          date_part: day
  3. Try to compile the model.
❯ dbt compile -m models/datamart/expectations_test.sql
00:43:58  Running with dbt=1.6.6
00:43:58  Registered adapter: snowflake=1.6.4
00:44:01  Found 729 models, 22 snapshots, 2244 tests, 392 sources, 9 exposures, 0 metrics, 791 macros, 0 groups, 0 semantic models
00:44:01
00:44:07  Concurrency: 6 threads (target='dev_snowflake')
00:44:07
00:44:08  Encountered an error:
Runtime Error
  Database Error in test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_expectations_test_foo__day (models/datamart/expectations_test.yml)
    002003 (42S02): SQL compilation error:
    Object 'DVERHEY_DEV.DATAMART.EXPECTATIONS_TEST' does not exist or not authorized.
  4. Optional: Add a start/end date boundary and notice that the error no longer reproduces, because the macro doesn't enter the problem code path.
# models/datamart/expectations_test.yml
version: 2

models:
  - name: expectations_test
    tests:
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          date_col: foo
          date_part: day
          test_start_date: '2023-01-01'
          test_end_date: '2023-01-05'
❯ dbt compile -m models/datamart/expectations_test.sql
00:46:43  Running with dbt=1.6.6
00:46:43  Registered adapter: snowflake=1.6.4
00:46:45  Found 729 models, 22 snapshots, 2244 tests, 392 sources, 9 exposures, 0 metrics, 791 macros, 0 groups, 0 semantic models
00:46:46
00:46:52  Concurrency: 6 threads (target='dev_snowflake')
00:46:52
00:46:53  Compiled node 'expectations_test' is:
...

Environment

- OS: MacOS 14.1
- Python: 3.9.16
- dbt: 1.6.6
- dbt-expectations: 0.10.0

Which database adapter are you using with dbt?

Snowflake

Additional Context

I don't have any great suggestions on how to fix this beyond making the date boundary required, and that would be a loss: being able to use expect_row_values_to_have_data_for_every_n_datepart without a date boundary is a nice feature.
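
One possible direction, offered only as an untested sketch: skip the introspective query when the relation doesn't exist yet, so that compile can finish and the test fails later at run time instead. The macro name get_date_bounds_if_relation_exists is hypothetical and is not part of dbt-expectations. The sketch assumes that model is a Relation object (i.e. the test has no where config, which would turn it into a subquery string) and relies on dbt-core's load_relation macro returning none for a missing relation.

```sql
{# Hypothetical helper, NOT part of dbt-expectations. #}
{# Assumes `model` is a Relation, not a `where`-filtered subquery string. #}
{% macro get_date_bounds_if_relation_exists(model, date_col) %}

    {# At parse time (execute is false), or when the relation has not been #}
    {# built yet, return none instead of querying a missing object. #}
    {% if not execute or load_relation(model) is none %}
        {{ return(none) }}
    {% endif %}

    {% set sql %}
        select
            min(cast({{ date_col }} as date)) as start_date,
            max(cast({{ date_col }} as date)) as end_date
        from {{ model }}
    {% endset %}

    {# run_query returns an agate table; take the first row of each column. #}
    {% set result = run_query(sql) %}
    {{ return({
        "start_date": result.columns[0].values()[0],
        "end_date": result.columns[1].values()[0]
    }) }}

{% endmacro %}
```

The calling test macro would still need to decide what to render when this returns none, for example a placeholder query, so this is a starting point rather than a complete fix.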

Having the same issue!

Facing the same issue, we had to use the workaround suggested here to progress.