ScalefreeCOM / datavault4dbt

Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including the Staging Area, DV2.0 main entities, PITs and Snapshot Tables.

Home Page: https://www.scalefree.com/

Hash Datatypes other than STRING cause an error

tkirschke opened this issue

Discussed in #47

Originally posted by mjahammel January 5, 2023
Are there allowed values for the variable hash_datatype other than STRING? I tried using BINARY (target is Snowflake), and am getting a couple of different errors, depending on the hash type selected. If the variable hash is set to SHA1 or SHA2, the error messages are "Invalid argument types for function 'LOWER': (BINARY(20))", "Invalid argument types for function 'LOWER': (BINARY(64))", respectively.

An example of the generated code for a hashkey is:

IFNULL(LOWER(SHA2_BINARY(NULLIF(CAST(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(UPPER(CONCAT(
        IFNULL((CONCAT('\"', REPLACE(REPLACE(REPLACE(TRIM(CAST("BRANCH_CODE" AS STRING)), '\\', '\\\\'), '"', '\"'), '^^', '--'), '\"')), '^^')
        )), '\n', '')
        , '\t', '')
        , '\v', '')
        , '\r', '') AS STRING), '^^'))), 'TO_BINARY(0000000000000000000000000000000000000000000000000000000000000000)') AS DV_HASHKEY_HUB_BRANCH_TEST

The problem seems to be the call to LOWER immediately after the IFNULL. Also, the default value seems to be incorrect. Instead of:

'TO_BINARY(0000000000000000000000000000000000000000000000000000000000000000)'

should it not be:

TO_BINARY('0000000000000000000000000000000000000000000000000000000000000000')

Thanks,
Maury

To Do's

  • Remove the LOWER function in hash default values when hash_datatype is set to something other than STRING
  • Fix the default unknown and default error values for SHA hash algorithms
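
The two To Do's can be illustrated with a small Python sketch that renders the SQL fragments. It is not the package's macro code: the function names hash_default and wrap_hash and the hex_length parameter are hypothetical, and it assumes the all-zero value from the issue is the intended default. LOWER is applied only when hash_datatype is STRING, and for any other datatype the default is rendered as a TO_BINARY call on a quoted hex string, as suggested above.

```python
def hash_default(hash_datatype: str, hex_length: int) -> str:
    """Render the all-zero default hash value as a SQL expression.

    hex_length is the number of hex characters (assumption: 64, as in the
    default value quoted in the issue).
    """
    zeros = "0" * hex_length
    if hash_datatype.upper() == "STRING":
        # String hashes: a plain quoted literal is enough.
        return f"'{zeros}'"
    # Other datatypes: quote the hex string, not the whole function call.
    return f"TO_BINARY('{zeros}')"


def wrap_hash(hash_expr: str, hash_datatype: str, hex_length: int) -> str:
    """Wrap a hash expression with its default, using LOWER only for STRING."""
    if hash_datatype.upper() == "STRING":
        # LOWER accepts strings only; on BINARY input Snowflake raises
        # "Invalid argument types for function 'LOWER'".
        hash_expr = f"LOWER({hash_expr})"
    return f"IFNULL({hash_expr}, {hash_default(hash_datatype, hex_length)})"


# "payload" stands in for the long CONCAT/REGEXP_REPLACE expression.
print(wrap_hash("SHA2_BINARY(payload)", "BINARY", 64))
print(wrap_hash("SHA2(payload)", "STRING", 64))
```

With hash_datatype set to BINARY, the rendered expression contains no LOWER call and its default is an actual TO_BINARY call rather than a quoted string.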