activewarehouse / activewarehouse-etl

Extract-Transform-Load library from ActiveWarehouse

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

date transformed to 0000-00-00 00:00:00 instead of nil

sgrgic opened this issue · comments

Hey,

So I got this error yesterday and here is use case:

  • load row where deleted_at col is nil
  • transform: transform(:deleted_at) {|name, value, row| value ? value.to_time : value}
  • destination is :file
  • load fata from file to MYSQL db table
  • loaded value of deleted_at col for specific row is 0000-00-00 00:00:00 instead of nil

As you know in rails if you use acts_as_paranoid all rows with deleted_at not nil will be ignored in where clause.
So this breaks some functionality in our system.
Workaround for this is in lib/etl/control/destination/file_destination.rb, you can see that in two commits I made in my fork
sgrgic@521aa93
sgrgic@0e19ce7

Regards,
Sinisa.

Hi Sinisa!

thanks for reporting this. You are using the bulk load processor after the file destination, is that right?

So do you mean mysql bulk load can understand "\N" to mean NULL?

That would be pretty cool. Is that available in all MySQL versions?

I'll create a mysql-specific test and will bring your commits in the feature branch, then I will show you (if needed) how to launch the tests yourself for the next patches you may have to do.

What do you think?

Mmm - the thing I wonder about is how other databases (such as postgresql or sqlserver) will handle the \N - in which case we may have to make sure this is configurable. I'll do more research on my side, but please let me know if you find anything!

There is more information (grep NULL) available here:

http://dev.mysql.com/doc/refman/5.1/en/load-data.html

Yup, bulk import from csv file and yeah mysql understand \N as null. Otherwise it is empty string and mysql loads it as those bunch of zeroes.

Thanks for quick answer!

This looks like a mis-configured MySQL install - you can configure what MySQL should do about invalid dates with the NO_ZERO_DATE option. The invalid date attempted inserted was coerced to a zero-date. Does that make sense?