How to prepare data & create a dataset.
AcckiyGerman opened this issue
As a new datahub user I want to learn how to prepare the data and create a data package.
Expected behavior
User reads the documentation and understands:
- what a data package is
- how to get data (web scraping, PDF, CSV, source APIs, etc.)
- how to clean and transform data (and why)
- how to create a data-package and share it
Tasks
- create a new section in the docs (name?)
- documents
- Introduction: what a data package is
- how to get data (different tools)
- how to wrangle and clean data (and why). Integrate these docs, too:
- http://okfnlabs.org/handbook/data/ (extract the relevant bits)
- https://github.com/rufuspollock/command-line-data-wrangling
- creating a datapackage and sharing it
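For reference while writing the "creating a datapackage" document, here is a rough sketch of what the minimal result could look like: a `datapackage.json` descriptor listing one CSV resource and its columns. It uses only the Python standard library. The helper name `build_descriptor`, the package name, the file path, and the sample data are all hypothetical, and every column is typed as `string` to keep the sketch short.

```python
import csv
import io
import json


def build_descriptor(name, resource_path, csv_text):
    """Build a minimal Data Package descriptor for one CSV resource.

    Hypothetical helper for illustration; not part of any datahub tool.
    """
    # Read only the header row to get the column names.
    header = next(csv.reader(io.StringIO(csv_text)))
    return {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": resource_path,
                "format": "csv",
                # Assumption: every field is declared as "string" here;
                # a real package would declare proper types per column.
                "schema": {
                    "fields": [{"name": col, "type": "string"} for col in header]
                },
            }
        ],
    }


# Made-up sample data, for illustration only.
sample_csv = "country,year,population\nExampleland,2017,1000\n"
descriptor = build_descriptor("example-population", "data/population.csv", sample_csv)

# The serialized descriptor is what would be saved as datapackage.json.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

The guide itself may prefer an existing tool to generate the descriptor; the point here is only to show how small the target file is.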
About 90% done. I'd also like to add a scraping section.
@Branko-Dj you can edit the doc here when the scraping
paragraph is ready: https://github.com/datahq/datahub-content/blob/master/docs/getting-started/datapackage-find-prepare-share-guide.md
@AcckiyGerman Ok, will do