[Bug] Unit tests fail when inputs contain reserved-word-named columns
mpatek opened this issue
Is this a new bug in dbt-core?
- I believe this is a new bug in dbt-core
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
In BigQuery:
Supposing an input table with a reserved-word-named column like:
-- stg
SELECT id, loaded_at, `from` FROM {{ source('some', 'source') }}
And a downstream table like:
-- some_model
SELECT id, loaded_at FROM {{ ref('stg') }}
qualify row_number() over (partition by id order by loaded_at desc) = 1
And a unit test like:
version: 2
unit_tests:
  - name: test__some_model
    model: some_model
    given:
      - input: ref('stg')
        rows:
          - {id: "a", loaded_at: "2024-01-01"}
          - {id: "a", loaded_at: "2024-01-02"}
          - {id: "a", loaded_at: "2024-01-02"}
    expect:
      rows:
        - {id: "a", loaded_at: "2024-01-02"}
If I try dbt test --select test_type:unit
I get: Syntax error: Unexpected keyword FROM
Expected Behavior
Expect tests to pass without errors.
Steps To Reproduce
- Create an input model with a keyword-named column (e.g. `from`)
- Create an output model that selects from the input model (not necessarily including the reserved-word-named column)
- Add unit test that puts data into input model and runs expectations on output model
- Run unit test
Relevant log output
No response
Environment
- OS: macOS Monterey
- Python: 3.11.7
- dbt: 1.8.0-rc1 w/ bigquery 1.8.0b2
Which database adapter are you using with dbt?
bigquery
Additional Context
No response
Thanks for reporting this @mpatek !
When I ran your example, I got the same error as you.
When I looked at the compiled SQL (in target/compiled/my_project/models/_unit.yml/models/test__some_model.sql for me, because my dbt project name in dbt_project.yml is "my_project"), it looked like this:
with __dbt__cte__stg as (
-- Fixture for stg
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-01''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
union all
select safe_cast('''a''' as STRING) as id, safe_cast('''2024-01-02''' as DATETIME) as loaded_at, safe_cast(null as INT64) as from
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1
And when I copy that compiled output to an analyses file, I get the same error:
cp target/compiled/my_project/models/_unit.yml/models/test__some_model.sql analyses
dbt show -s analyses/test__some_model.sql
But if I modify that analysis file to have backticks around all the column names, then it can run successfully:
with __dbt__cte__stg as (
-- Fixture for stg
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-01''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
union all
select safe_cast('''a''' as STRING) as `id`, safe_cast('''2024-01-02''' as DATETIME) as `loaded_at`, safe_cast(null as INT64) as `from`
) -- some_model
SELECT id, loaded_at FROM __dbt__cte__stg
qualify row_number() over (partition by id order by loaded_at desc) = 1
Output
21:52:49 Previewing node 'test__some_model':
| id | loaded_at |
| -- | ------------------- |
| a | 2024-01-02 00:00:00 |
So I'm wondering if we only need to apply quoting in the right place.
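As a rough illustration of why only one of the three fixture columns breaks the compiled query, the check below flags column names that collide with a reserved keyword. It is only a sketch: `RESERVED_SUBSET` and `columns_needing_quotes` are hypothetical names, and the keyword set is a small hand-picked subset of BigQuery's reserved keywords, not the full list.

```python
# Illustrative subset of BigQuery reserved keywords (assumption: NOT the full list).
RESERVED_SUBSET = {
    "from", "select", "where", "group", "order",
    "by", "join", "union", "as", "null",
}


def columns_needing_quotes(column_names, reserved=RESERVED_SUBSET):
    """Return the column names that match a reserved keyword, ignoring case."""
    return [name for name in column_names if name.lower() in reserved]


# Columns of the `stg` model from the report above.
print(columns_needing_quotes(["id", "loaded_at", "from"]))  # ['from']
```

Quoting every alias unconditionally, as in the hand-edited query above, avoids having to maintain a keyword list at all.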
@dbeatty10 I think you're right, and I think the right spots might be here and here:
{%- for column_name, column_value in ... %} {{ column_value }} as {{ adapter.quote(column_name) }}{% if not loop.last -%},{%- endif %}
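To make the effect of that loop concrete, here is a plain-Python analogue of it. This is a sketch, not dbt's implementation: `bigquery_quote` is a hypothetical stand-in for `adapter.quote` that assumes BigQuery-style backtick quoting, and `render_fixture_select` is an illustrative name. It does not escape backticks inside an identifier.

```python
def bigquery_quote(identifier: str) -> str:
    # Assumption: stand-in for adapter.quote, using BigQuery-style backticks.
    return f"`{identifier}`"


def render_fixture_select(row: dict, quote=bigquery_quote) -> str:
    """Render one fixture row as a SELECT, passing each alias through `quote`."""
    # ", ".join plays the role of the `loop.last` comma handling in the template.
    parts = [f"{value} as {quote(name)}" for name, value in row.items()]
    return "select " + ", ".join(parts)


# One fixture row from the compiled SQL above: column name -> cast expression.
row = {
    "id": "safe_cast('''a''' as STRING)",
    "loaded_at": "safe_cast('''2024-01-01''' as DATETIME)",
    "from": "safe_cast(null as INT64)",
}

print(render_fixture_select(row))                            # aliases quoted
print(render_fixture_select(row, quote=lambda name: name))   # current, unquoted form
```

The first line printed matches the hand-edited query that ran successfully; the second ends in `as from`, which is the alias BigQuery rejects.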
Minimal reprex
models/my_model.sql
select 1 as {{ adapter.quote("from") }}
models/_unit.yml
unit_tests:
  - name: test___my___model
    model: my_model
    given: []
    expect:
      rows:
        - {from: 1}
Run the model and its tests:
dbt build -s my_model
Implementation prototype
We'd probably choose to actually implement it differently than this, but here is a prototype that worked with the reprex above:
Closing, since this should probably be a dbt-adapters issue: dbt-labs/dbt-adapters#205