activewarehouse / activewarehouse-etl

Extract-Transform-Load library from ActiveWarehouse

SCD with no changes causes bad query

epinault opened this issue

Scenario:

  • I have a file-to-file transform. I run it once and the data is loaded correctly. I then run it again with no changes to the source data.
  • I'm using SCD type 2 in the output definition, and the natural key also exists.

Results

When trying to make some SCD type 2 changes, and no changes were made in a dimension, line 399
of lib/etl/control/destination.rb is called, throws an error, and breaks with:

Error writing to #ETL::Control::FileDestination:0xa981598: PG::Error: ERROR: syntax error at end of input
LINE 1: DELETE FROM board_accounts WHERE id =

The primary key is nil in that case, which completely breaks the query.
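
To illustrate the failure mode, here is a minimal sketch of how interpolating a nil key produces exactly the truncated statement in the error above, along with one possible guard. The helper `build_delete_sql` is hypothetical and is not part of activewarehouse-etl; the actual code at line 399 of destination.rb may look quite different.

```ruby
# Hypothetical helper (not the gem's API): builds the DELETE statement for a
# dimension row, or returns nil when there is no key value to match on.
def build_delete_sql(table, primary_key, key_value)
  # Interpolating nil would yield "... WHERE id = ", the truncated statement
  # that PostgreSQL rejects with "syntax error at end of input".
  return nil if key_value.nil?

  # Integer() raises on non-numeric input, so only a numeric key is embedded.
  "DELETE FROM #{table} WHERE #{primary_key} = #{Integer(key_value)}"
end

# Without a guard, a nil key silently becomes an empty string:
puts "DELETE FROM board_accounts WHERE id = #{nil}".inspect

# With the guard, the caller can skip the delete when nothing was matched:
sql = build_delete_sql("board_accounts", "id", nil)
puts(sql.nil? ? "skipping delete: no primary key" : sql)
```

Whether skipping the delete is the right behaviour here is an assumption; it depends on why the SCD code path reaches the delete at all when the row is unchanged.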

I can't comment on this one for now - some investigation is needed.

Let me know what you need. I can try to create a small scenario in some repo if you need.

Given my current lack of available time and my current knowledge about how SCD works in the gem (none!), I'd be happy if you could have a closer look and diagnose this (if this specific feature is important for you, that is).

I'd be glad to incorporate a well-tested change on that topic.

Yeah, I am reading the code right now. Something doesn't seem right about SCD. Who would know about it? I posted in the Google group, but it seems like no one is looking at it yet :(

SCD seemed like a mess when I looked at it awhile back - I don't use it, so I can't help much on it.
If there were a small example/scenario, I would be vastly more likely to help out with this.

A very focused reproduction would probably be a good basis, so that we all learn how SCD currently works :)

@kookster How do you guys deal with it then, if you're not using it with the framework? I'll have a focused repro tomorrow, I think.

@epinault-ttc my use cases when working with aw-etl never required SCD, so far, so I'm not using them!

We did not need SCD.

We update the records in the dim to match the latest info from the source - we do not track the changes.

For a bit I considered using SCD in case we needed to track the changes, then decided we didn't really need the info, so it wasn't worth the effort to track it.
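
For readers comparing the two approaches discussed in this thread, the sketch below contrasts overwriting a dimension row in place (type 1) with keeping history (type 2), using plain in-memory hashes. The method names and the `effective_date` / `end_date` columns are assumptions made for illustration and are not taken from the gem's code. Note that the type 2 path does nothing when the incoming row is unchanged, which is the scenario that triggers the error reported above.

```ruby
require "date"

# Type 1 (hypothetical sketch): overwrite the matching row, keep no history.
def apply_type1(dim, natural_key, incoming)
  row = dim.find { |r| r[natural_key] == incoming[natural_key] }
  row ? row.merge!(incoming) : dim << incoming.dup
  dim
end

# Type 2 (hypothetical sketch): close the current version and append a new one.
# Rows with a nil :end_date are treated as the current version.
def apply_type2(dim, natural_key, incoming, today)
  current = dim.find { |r| r[natural_key] == incoming[natural_key] && r[:end_date].nil? }
  if current
    tracked = current.reject { |k, _| %i[effective_date end_date].include?(k) }
    return dim if tracked == incoming # nothing changed: no delete, no insert
    current[:end_date] = today
  end
  dim << incoming.merge(effective_date: today, end_date: nil)
  dim
end

dim = []
apply_type2(dim, :code, { code: "A1", name: "Acme" }, Date.new(2013, 1, 1))
apply_type2(dim, :code, { code: "A1", name: "Acme" }, Date.new(2013, 1, 2))     # unchanged
apply_type2(dim, :code, { code: "A1", name: "Acme Inc" }, Date.new(2013, 1, 3)) # changed
puts dim.length
```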