compare_column_values() does not work on dbt Cloud
davesgonechina opened this issue · comments
Describe the bug
The examples for compare_column_values (source)
use print_table()
but according to Fishtown Analytics support, print_table()
does not work in dbt Cloud. There should at minimum be a disclaimer in the README, especially for the Advanced Usage, as the SQL is generated but not the table, forcing the user to copy paste and run each compiled SQL manually.
Steps to reproduce
Follow the instructions for usage or advanced usage of audit_helper.compare_column_values()
in dbt Cloud and execute with dbt run-operation
In my case, this:
{% macro audit_my_table() %}
{%- set columns_to_compare=adapter.get_columns_in_relation(ref('my_new_table')) -%}
{% set old_etl_relation_query %}
select * from my_old_table
{% endset %}
{% set new_etl_relation_query %}
select * from {{ ref('my_new_table') }}
{% endset %}
{% if execute %}
{% for column in columns_to_compare %}
{{ log('Comparing column "' ~ column.name ~'"', info=True) }}
{% set audit_query = audit_helper.compare_column_values(
a_query=old_etl_relation_query,
b_query=new_etl_relation_query,
primary_key="my_pk",
column_to_compare=column.name
) %}
{% set audit_results = run_query(audit_query) %}
{% do audit_results.print_table() %}
{{ log(audit_query, info=True) }}
{% endfor %}
{% endif %}
{% endmacro %}
When I dbt run-operation
this macro, nothing happens at all in the logs, and there's some circumstantial evidence that it messes up my dbt Cloud Develop pod leading to basic queries of (the latter was due to non-breaking spaces)my_new_table
and possibly others failing.
If I remove the print_table()
line, the above compiles and prints the first query in the logs successfully, it successfully compiles all the queries, and if I copy-paste-run any of those from the logs into a statement tab and run it, they work fine and I see the expected table in the results tab.
Expected results
A markdown formatted table based on a successfully compiled SQL query in the logs similar to the ones in the README.
Actual results
The macro will run "successfully" and compares every column but does not print any of the markdown tables in the logs.
System information
packages:
- package: fishtown-analytics/dbt_utils
version: 0.6.4 - package: fishtown-analytics/codegen
version: 0.3.1 - package: fishtown-analytics/audit_helper
version: 0.3.0 - package: gitlabhq/snowflake_spend
version: 1.2.0
dbt_date includes dbt_utils
- package: calogica/dbt_date
version: [">=0.3.0"]
Which database are you using dbt with?
- postgres
- redshift
- bigquery
- snowflake
- other (specify: ____________)
The output of dbt --version
:
0.19.1
The operating system you're using:
N/A
The output of python --version
:
N/A
Additional context
There is a support ticket
Are you interested in contributing the fix?
Yes.
I've had some success substituting {% do log(audit_results.rows.values(), info=True) %}
to print row tuples to dbt Cloud's stderr
logs, perhaps the README could warn dbt Cloud users of the limitation around Agate print_table()
outputting to stdout
, which is not visible in the dbt Cloud UI, and suggest using this workaround.