tylerwmarrs / data-engineering-project-doc-templates

Project documentation templates, derived from CRISP-DM, for data engineering projects.

Data Engineering Project Documentation Templates

This repository provides guidance for a standard data engineering project that consists of a data lake and a data warehouse. The documentation grew out of a need to standardize a requirements-gathering methodology. It is derived from CRISP-DM (Cross-Industry Standard Process for Data Mining).

Usage

The templates directory contains nine templates, numbered in logical order. Text in italics within a template is there for reference purposes. You may clone, modify, or fork the repository at your leisure.
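
If you want to walk through the templates in their intended order from a local clone, a small script can list them. This is only a sketch: it assumes each file name in the templates directory begins with its number, and the function name `ordered_templates` is made up for illustration, not part of this repository.

```python
import re
from pathlib import Path


def ordered_templates(directory="templates"):
    """Return the names of numbered files in `directory`, sorted by number.

    Assumption: each template file name starts with its number.
    Files without a leading number are skipped.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    numbered = []
    for path in root.iterdir():
        match = re.match(r"(\d+)", path.name)
        if path.is_file() and match:
            numbered.append((int(match.group(1)), path.name))
    # Sort numerically so that a file numbered 10 comes after one numbered 2.
    return [name for _, name in sorted(numbered)]


if __name__ == "__main__":
    for name in ordered_templates():
        print(name)
```

Sorting on the parsed integer rather than on the raw file name keeps the order correct if the numbering ever grows past a single digit.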

Note that some documentation processes may overlap as you learn more about your project. Do not feel obligated to fill everything out in sequence. Generally, you will fill out the first few documents in order and adjust as needed. For more details, learn about CRISP-DM.

Contributing

My goal in publishing these templates is to teach others how to formalize a process around data engineering. It would be awesome for the community to expand on some of the templates to make them more featureful.

To contribute, fork this repository and open a pull request with your changes.

About

License: Apache License 2.0