jdh33 / open-data-streams

Data validation and profiling for publishing City of Portland open datasets

Open Data STREAMS

This project is an evaluation framework for open datasets published by the City of Portland. The goal is a data analysis tool that provides quality assurance for datasets shared between bureaus and released to the public.
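As a rough illustration of the kind of profiling such a tool might perform, the sketch below counts rows and empty cells per column in a CSV file using only the Python standard library. It is not taken from this repository: the function name `profile_csv` and the sample columns (`permit_id`, `bureau`, `issued`) are hypothetical and chosen only for the example.

```python
import csv
import io


def profile_csv(text):
    """Return the row count and per-column counts of empty cells for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            # A short row yields None for absent cells; treat it like an empty string.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": missing}


# Hypothetical sample data, not a real City of Portland dataset.
SAMPLE = "permit_id,bureau,issued\n1,Parks,2019-01-04\n2,,2019-02-11\n3,Water,\n"

report = profile_csv(SAMPLE)
print(report)
```

A real check would likely go further (type inference, date formats, duplicate keys), but a missing-value count is often a useful first signal when deciding whether a dataset is ready to publish.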

Contributing

If an Issue or Project interests you but its description doesn't give you everything you need to get started, please contact us. You can find us at https://www.codeforpdx.org/welcome, and our channel in the Code for PDX Slack is #open-data-streams.

Introductory information about the open data program in Portland.

Download the preliminary draft of the open data handbook from this page:

https://www.smartcitypdx.com/open-data-program

Resources page on data.gov.

Look in particular at the DCAT-US schema for data catalog metadata and at the data tools:

https://resources.data.gov/

https://resources.data.gov/categories/data-tools/
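To make the DCAT-US reference concrete, here is a minimal sketch that checks catalog entries for the dataset-level fields that DCAT-US v1.1 lists as always required. It is an illustration under stated assumptions, not this project's validator: the function `check_dataset`, the sample catalog, and every value in it are hypothetical, and the federal-only fields (`bureauCode`, `programCode`) are deliberately left out because they would not apply to a city catalog.

```python
import json

# Dataset-level fields that DCAT-US v1.1 marks as always required.
# Federal-only fields (bureauCode, programCode) are intentionally omitted.
REQUIRED_DATASET_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)
ACCESS_LEVELS = {"public", "restricted public", "non-public"}


def check_dataset(entry):
    """Return a list of human-readable problems found in one dataset entry."""
    problems = []
    for field in REQUIRED_DATASET_FIELDS:
        if entry.get(field) in (None, "", [], {}):
            problems.append(f"missing required field: {field}")
    level = entry.get("accessLevel")
    if level and level not in ACCESS_LEVELS:
        problems.append(f"unknown accessLevel: {level!r}")
    return problems


# Hypothetical catalog used only for this example.
catalog = json.loads("""
{
  "dataset": [
    {
      "title": "Street Trees",
      "description": "Example inventory of street trees.",
      "keyword": ["trees"],
      "modified": "2019-06-01",
      "publisher": {"name": "Example Bureau"},
      "contactPoint": {"fn": "Example Contact", "hasEmail": "mailto:opendata@example.org"},
      "identifier": "example-street-trees",
      "accessLevel": "public"
    },
    {"title": "Untitled draft", "accessLevel": "open"}
  ]
}
""")

for entry in catalog["dataset"]:
    print(entry.get("title"), check_dataset(entry))
```

This only tests for presence and one controlled vocabulary. Validating field formats (ISO 8601 dates, `mailto:` addresses, URL shapes) is better left to the published JSON schema.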

Data manifest information:

https://github.com/GSA/digital-strategy

https://labs.data.gov/dashboard/docs/main#automated_metrics

Project documentation support

Agile user stories:

https://docs.google.com/spreadsheets/d/1VADQ1TJR2p7mCInZBdQ2EppEjZrk9DWK8fKwIka8Eb4/edit?usp=sharing

Open data curation rules taxonomy

(We can start with a selection of these rules.)

https://docs.google.com/spreadsheets/d/1KMU1Q4nHR0QTeIVHsIJmnu6g7n0p0YcyRWuZAyvth1Y/edit?usp=sharing

Other Resources

Frictionless data tutorial: https://youtu.be/foao4cou5JM?t=2628
Data package creator: https://create.frictionlessdata.io/
Goodtables.io: http://goodtables.io/dashboard
Goodtables validator: http://try.goodtables.io/

Languages

Python 54.0%, HTML 46.0%