ScalefreeCOM / datavault4dbt

Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including the Staging Area, DV2.0 main entities, PITs and Snapshot Tables.

Home Page: https://www.scalefree.com/

Same hashdiff inserted again in Satellite, without having in between delta states

bschlottfeldt opened this issue

Problem statement

I encountered this problem while testing in my project.
We have a Satellite holding a hashkey, for example '123', with hashdiff 'abcd' and load date '2023-09-08 13:08:04.421'.
We then load new records into the Stage with load date '2023-09-08 14:54:00.818' (about two hours later). These records contain no delta: multiple records (from different files) carry the same hashdiff for the same hashkey.
The current implementation inserts the same hashdiff into the Satellite again, which is not the expected behaviour.

Expected behaviour

None of the records should be inserted into the Satellite, because they hold the same hashdiff as the latest hashdiff present in the Satellite and there are no in-between deltas in the Stage.
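
To make the expected behaviour concrete, here is a minimal Python sketch of the delta check as I understand it. It is not the package's macro code: the names `Record` and `records_to_insert` are hypothetical, and the logic simply assumes a staged record should be kept only when its hashdiff differs from the previous state of the same hashkey (either the latest Satellite entry or the preceding staged record).

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    hashkey: str
    hashdiff: str
    ldts: str  # timestamp string in a format that sorts chronologically


def records_to_insert(latest_in_sat: Dict[str, str], staged: List[Record]) -> List[Record]:
    """Return only the staged records that represent a real change.

    latest_in_sat maps each hashkey to the latest hashdiff already in the Satellite.
    """
    previous = dict(latest_in_sat)  # hashkey -> last seen hashdiff
    result = []
    # Walk each hashkey's records in load-date order and compare to the prior state.
    for rec in sorted(staged, key=lambda r: (r.hashkey, r.ldts)):
        if previous.get(rec.hashkey) != rec.hashdiff:
            result.append(rec)
        previous[rec.hashkey] = rec.hashdiff
    return result


# Scenario from this issue: Satellite holds '123'/'abcd', Stage holds duplicates only.
sat = {"123": "abcd"}
no_delta = records_to_insert(sat, [
    Record("123", "abcd", "2023-09-08 14:54:00.818"),
    Record("123", "abcd", "2023-09-08 14:54:00.818"),
])

# Contrast case (illustrative values): an in-between delta 'efgh' exists in the Stage.
with_delta = records_to_insert(sat, [
    Record("123", "abcd", "2023-09-08 14:54:00.818"),
    Record("123", "efgh", "2023-09-08 15:00:00.000"),
    Record("123", "abcd", "2023-09-08 16:00:00.000"),
])
```

In the first scenario nothing should be inserted. In the contrast case only the change to 'efgh' and the change back to 'abcd' are kept, while the leading 'abcd' is dropped because it matches the Satellite's latest entry.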

The problem is in the CTE deduplicated_numbered_source, which applies the QUALIFY in the same step that adds the ROW_NUMBER.