datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io

Home Page: https://datahub.io/

[Epic] Polish Core Datasets

sglavoie opened this issue

As a PM, I want to review the core datasets and make sure all of them are up to date, so that I'm confident in their quality

As a PM, I want to re-run the scrapers/scripts and make sure they work, so that I can update the data at any time

As a PM, I want to translate scrapers/scripts into dataflows, where possible, so that we use our own tools to get the data and can easily update it

As a PM, I want to review the READMEs of the core datasets and update them where necessary, so that I (and users) can be sure the dataset descriptions are accurate

Acceptance Criteria

  • We have the latest data for all core datasets
  • We have all of the non-complex scripts working OK (unless the source is broken)
  • We have the missing sources and complex scripts fixed up
  • READMEs are up to date
  • We use dataflows to get the data
    • Automated by Travis

Tasks

  • List all datasets
  • Find and fix the non-complex scripts that are not currently working and review READMEs
  • Fix scripts that are complex and have uncommon errors
  • Translate scripts to dataflows
  • Fix/refactor more simple scripts addendum #267
  • Fix the broken source datasets #266
  • Fix scripts that require further analysis and debugging #265
  • Run on a schedule with Travis
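
A minimal sketch of how the "find the scripts that are not currently working" task could be triaged. It assumes each dataset lives in its own directory with an update script at `scripts/process.py`; that layout, the function names `check_script` and `triage`, and the 10-minute timeout are illustrative assumptions, not something stated in this issue.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path


def check_script(script: Path, timeout: int = 600) -> str:
    """Run one update script and classify the outcome.

    Returns "ok" on exit code 0, "failed" on any other exit code,
    and "timeout" if the script runs longer than `timeout` seconds.
    """
    try:
        result = subprocess.run(
            [sys.executable, script.name],
            cwd=script.parent,       # run from the script's own directory
            capture_output=True,     # keep script output out of the report
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


def triage(root: Path, pattern: str = "*/scripts/process.py") -> dict[str, str]:
    """Map each dataset directory name under `root` to its script status."""
    return {
        script.parent.parent.name: check_script(script)
        for script in sorted(root.glob(pattern))
    }
```

Calling `triage(Path("path/to/checkouts"))` on a folder of dataset checkouts would give a first list of candidates for the "non-complex" and "complex" buckets. A non-zero exit code only shows that a script broke, not whether the fault is in the script or in the upstream source, so each failure still needs a manual look.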

Created by @zelima