datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io

Home Page: https://datahub.io/

Wrong behaviour when converting YYYY-MM to YYYY-MM-DD

anuveyatsu opened this issue

When my CSV file is processed, YYYY-MM values are converted into YYYY-MM-DD, which is OK for me. However, today's day is used as the value for DD, whereas I expect the day to be 01. For example, if I push a dataset on 18 July, my 2018-07 value will be converted to 2018-07-18.

How to reproduce

Expected behavior

Values are either left unconverted, or 01 is used as the value for the day.
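To make the expected behaviour concrete, here is a minimal sketch in plain Python. It is not the dpp implementation; `normalize_year_month` is a hypothetical helper name, and it relies only on the standard library, where `datetime.strptime` fills a missing day with 1 rather than with today's day.

```python
from datetime import datetime


def normalize_year_month(value: str) -> str:
    """Convert a YYYY-MM string to YYYY-MM-01; leave anything else untouched.

    Hypothetical helper for illustration only, not part of dpp or datahub.
    """
    try:
        # strptime defaults the missing day to 1, independent of the run date.
        parsed = datetime.strptime(value, "%Y-%m")
    except ValueError:
        # Not a bare year-month value (e.g. already a full date): keep as-is.
        return value
    return parsed.date().isoformat()


print(normalize_year_month("2018-07"))     # 2018-07-01
print(normalize_year_month("2018-07-18"))  # 2018-07-18
```

With this approach the result does not depend on the day the dataset is pushed.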

@akariv as far as I remember, the transformation is done by dpp. Do you think it's reasonable to fix it as proposed above? I think this is the commit that made the conversion possible: frictionlessdata/datapackage-pipelines@f3d83c6

Personally, I think using today's day is arguably more accurate than using 01. What if the actual date is not the 1st, e.g. the end of the month (the 30th) or the middle (the 15th)?

Also, let's say we pushed a dataset on the 20th of the month (due to some fix). The schedule is calculated as that date plus one month, so we collect data every 20th instead of every 1st, and people have to wait 20 days to get the newest data. We need to push automated datasets at the beginning of the day/month/year anyway, or have the ability to schedule runs on date X instead of every X interval (or accept both). This would solve both problems.
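The difference between the two scheduling styles can be sketched as follows. This is an illustration only, not how the push flow actually computes its schedule; `add_one_month` and `next_run_on_day` are hypothetical names, and clamping to the last day of shorter months is an assumption.

```python
import calendar
from datetime import date


def add_one_month(d: date) -> date:
    """Interval style: same day next month, clamped to the month's length."""
    year = d.year + (1 if d.month == 12 else 0)
    month = 1 if d.month == 12 else d.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def next_run_on_day(after: date, day: int = 1) -> date:
    """Fixed-date style: first date strictly after `after` falling on `day`."""
    last_day = calendar.monthrange(after.year, after.month)[1]
    candidate = date(after.year, after.month, min(day, last_day))
    if candidate > after:
        return candidate
    # Otherwise roll over to the following month.
    nxt = add_one_month(date(after.year, after.month, 1))
    last_day = calendar.monthrange(nxt.year, nxt.month)[1]
    return date(nxt.year, nxt.month, min(day, last_day))


pushed = date(2018, 7, 20)
print(add_one_month(pushed))    # 2018-08-20: runs drift to every 20th
print(next_run_on_day(pushed))  # 2018-08-01: runs stay on the 1st
```

Scheduling on a fixed day keeps runs aligned with the start of the month regardless of when a manual push happened.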

Generally, I think we should switch to dataflows and Travis schedules for automated uploads instead of the old push flow; both problems would be solved that way as well.

WDYT @akariv @anuveyatsu

cc @Branko-Dj This would be a great start for using dataflows.