[BUG] Invalid SQL compiled when extracting relation columns from duckdb adapter
qmg-tmay opened this issue · comments
Is this a new bug in dbt-expectations?
- I believe this is a new bug in dbt-expectations
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Macros relying on relation column names build invalid SQL when using DuckDB adapter.
For example, using dbt_expectations.expect_table_columns_to_match_set
causes the following error:
Runtime Error in test dbt_expectations_source_expect_table_columns_to_match_set_local_input_id__name__property_type (models/sources.yml)
Parser Error: syntax error at or near ")"
LINE 10: ),
^
If I run the tests with the debug flag, I can see the compiled SQL looks like this:
select
count(*) as failures,
count(*) != 0 as should_warn,
count(*) != 0 as should_error
from (
with relation_columns as (
),
input_columns as (
select cast('ID' as TEXT) as input_column
union all
select cast('NAME' as TEXT) as input_column
union all
select cast('PROPERTY_TYPE' as TEXT) as input_column
)
select *
from
relation_columns r
full outer join
input_columns i on r.relation_column = i.input_column
where
-- catch any column in input list that is not in the list of table columns
-- or any table column that is not in the input list
r.relation_column is null or
i.input_column is null
) dbt_internal_test
The relation_columns
CTE contains nothing leading to a syntax error.
I have experienced the same issue when using the dbt_expectations.expect_column_to_exist
test.
Expected Behavior
The macro should compile to valid SQL.
Steps To Reproduce
My dbt_project.yml
is:
name: 'validation_spike'
version: '1.0.0'
config-version: 2
profile: 'validation_spike'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
clean-targets:
- "target"
- "dbt_packages"
models:
validation_spike:
# Config indicated by + and applies to all files under models/example/
+materialized: view
vars:
'dbt_date:time_zone': "Europe/London"
My profiles.yml
file is:
validation_spike:
target: dev
outputs:
dev:
type: duckdb
path: ":memory:"
My models/sources.yml
file is:
version: 2
sources:
- name: local
meta:
external_location: "~/<path>/validation_spike/data/{name}.csv"
tables:
- name: input
description: Valid input data
tests:
- dbt_expectations.expect_table_columns_to_match_set:
column_list: ["id", "name", "property_type"]
columns:
- name: id
description: Customer id
tests:
- unique
- not_null
- name: name
description: Customer name
tests:
- not_null
- name: property_type
description: Property type
tests:
- accepted_values:
name: invalid_property_type
values: ["bungalow", "house", "flat/maisonette"]
My dependencies.yml
file is:
packages:
- package: calogica/dbt_expectations
version: 0.10.0
My test file is (input.csv):
id,name,property_type
1,Sam,bungalow
2,Josh,flat/maisonette
3,Toby,house
Relevant log output
Runtime Error in test dbt_expectations_source_expect_table_columns_to_match_set_local_input_id__name__property_type (models/sources.yml)
Parser Error: syntax error at or near ")"
LINE 10: ),
^
Environment
- OS: MacOs Monterey 12.7
- Python: 3.11.0
- dbt: 1.6.6 (dbt-duckdb 1.6.1)
- dbt-expectations: 0.10.0
Which database adapter are you using with dbt?
dbt-duckdb
Note: dbt-expectations currently does not support database adapters other than the ones listed below.
- Postgres
- Snowflake
- BigQuery