dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Home Page: https://getdbt.com

[Bug] Unit tests fail when inputs contain reserved-word-named columns

mpatek opened this issue

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

In BigQuery, suppose there is an input model with a reserved-word-named column, like:

-- stg
SELECT id, loaded_at, `from` FROM {{ source('some', 'source') }}

And a downstream table like:

-- some_model
SELECT id, loaded_at FROM {{ ref('stg') }}
qualify row_number() over (partition by id order by loaded_at desc) = 1

And a unit test like:

version: 2

unit_tests:
  - name: test__some_model
    model: some_model
    given:
      - input: ref('stg')
        rows:
          - {id: "a", loaded_at: "2024-01-01"}
          - {id: "a", loaded_at: "2024-01-02"}
          - {id: "a", loaded_at: "2024-01-02"}
    expect:
      rows:
        - {id: "a", loaded_at: "2024-01-02"}

If I try dbt test --select test_type:unit

I get: Syntax error: Unexpected keyword FROM

Expected Behavior

Expect tests to pass without errors.

Steps To Reproduce

  1. Create input model with keyword-named column (e.g. from)
  2. Create output model that selects from the input model (not necessarily including the reserved-word-named column).
  3. Add unit test that puts data into input model and runs expectations on output model
  4. Run unit test

Relevant log output

No response

Environment

- OS: macOS Monterey
- Python: 3.11.7
- dbt: 1.8.0-rc1 w/ bigquery 1.8.0b2

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

Thanks for reporting this @mpatek !

When I ran your example, I got the same error as you.

When I looked at the compiled SQL (in target/compiled/my_project/models/_unit.yml/models/test__some_model.sql for me because my dbt project name in dbt_project.yml is "my_project"), it looked like this:

with __dbt__cte__stg as (

-- Fixture for stg
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-01''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1

And when I copy that compiled output to an analyses file, I get the same error:

cp target/compiled/my_project/models/_unit.yml/models/test__some_model.sql analyses             
dbt show -s analyses/test__some_model.sql 

But if I modify that analysis file to have backticks around all the column names, then it can run successfully:

with __dbt__cte__stg as (

-- Fixture for stg
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-01''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1

Output

21:52:49  Previewing node 'test__some_model':
| id |           loaded_at |
| -- | ------------------- |
| a  | 2024-01-02 00:00:00 |

So I'm wondering if we just need to apply quoting in just the right place.
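To spot which aliases in a compiled fixture would trip the parser, a quick check along these lines might help. This is only a sketch: `find_reserved_aliases` and `RESERVED_SUBSET` are hypothetical names, the keyword set is a small hand-picked subset rather than BigQuery's full reserved-word list, and the regex only looks at simple `as <name>` aliases.

```python
import re

# Assumption: a deliberately incomplete subset of BigQuery reserved keywords,
# just enough for this illustration.
RESERVED_SUBSET = {"from", "select", "where", "group", "order", "by", "join", "on", "union"}

# Matches `as <identifier>` only when the identifier is NOT wrapped in backticks
# (a leading backtick fails the first character class, so quoted aliases are skipped).
UNQUOTED_ALIAS = re.compile(r"\bas\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def find_reserved_aliases(sql: str) -> list[str]:
    """Return unquoted alias names in `sql` that collide with RESERVED_SUBSET."""
    hits: list[str] = []
    for match in UNQUOTED_ALIAS.finditer(sql):
        name = match.group(1)
        if name.lower() in RESERVED_SUBSET and name not in hits:
            hits.append(name)
    return hits


# One row from the compiled fixture, before and after adding backticks.
broken = "select safe_cast('''a''' as STRING) as id, safe_cast(null as INT64) as from"
fixed = "select safe_cast('''a''' as STRING) as `id`, safe_cast(null as INT64) as `from`"

print(find_reserved_aliases(broken))  # ['from']
print(find_reserved_aliases(fixed))   # []
```

Type names such as STRING and INT64 also follow `as` inside `safe_cast(...)`, but they are not in the subset, so only the `from` alias is flagged.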

@dbeatty10 I think you're right, and I think the right spots might be here and here:

{%- for column_name, column_value in ... %} {{ column_value }} as {{ adapter.quote(column_name) }}{% if not loop.last -%},{%- endif %}
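The effect of that change can be sketched outside of Jinja. The Python below is an illustration only, not dbt's implementation: `quote_bigquery` is a hypothetical stand-in for `adapter.quote` that assumes BigQuery-style backtick quoting, and `render_fixture_row` / `render_fixture` are made-up helper names mirroring the loop in the template.

```python
def quote_bigquery(identifier: str) -> str:
    """Wrap an identifier in backticks (hypothetical stand-in for adapter.quote)."""
    # Assumption: identifiers containing backticks are rejected rather than escaped.
    if "`" in identifier:
        raise ValueError(f"cannot quote identifier containing a backtick: {identifier!r}")
    return f"`{identifier}`"


def render_fixture_row(row: dict[str, str], quote=quote_bigquery) -> str:
    """Render one fixture row as a SELECT, quoting every column alias."""
    columns = ", ".join(f"{expr} as {quote(name)}" for name, expr in row.items())
    return f"select {columns}"


def render_fixture(rows: list[dict[str, str]]) -> str:
    """Join the per-row SELECTs with UNION ALL, like the compiled fixture CTE."""
    return "\nunion all\n".join(render_fixture_row(row) for row in rows)


# Column name -> already-rendered value expression, as in the compiled SQL above.
rows = [
    {"id": "safe_cast('''a''' as STRING)", "from": "safe_cast(null as INT64)"},
    {"id": "safe_cast('''b''' as STRING)", "from": "safe_cast(null as INT64)"},
]
print(render_fixture(rows))
```

Quoting every alias, rather than only the reserved ones, avoids having to maintain a keyword list per adapter, which matches the hand-edited analysis file that ran successfully.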

Minimal reprex

models/my_model.sql

select 1 as {{ adapter.quote("from") }}

models/_unit.yml

unit_tests:

  - name: test___my___model
    model: my_model
    given: []
    expect:
      rows:
        - {from: 1}

Run the model and its tests:

dbt build -s my_model

Implementation prototype

We'd probably choose to actually implement it differently than this, but here is a prototype that worked with the reprex above:

Closing, since this should probably be a dbt-adapters issue: dbt-labs/dbt-adapters#205