maciejd / dataEngineeringTemplate

Template for Data Engineering and Data Pipeline projects

dataEngineeringTemplate

Project Overview

This is a high-level description of the project and what it is trying to accomplish.

  1. Add your requirements to the requirements.txt file for Python pip packages.
  2. Add any necessary installations to the Dockerfile.

Architecture

This is a high-level description of the tool(s) used and the reasons those tool(s) were chosen.

Testing

These are the instructions for testing this repo. All tests are located inside the tests folder and use pytest. Run the following steps.

  1. docker build --tag my-project .
  2. docker-compose up test

Add your unit tests to files inside the tests folder, and name each file test_somename.py.
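As a minimal sketch of what such a file might contain, the example below defines a small helper and two pytest tests for it. The helper normalize_record and the file name tests/test_normalize.py are hypothetical, chosen only for illustration; in a real project the function would be imported from your own pipeline code.

```python
# tests/test_normalize.py  (hypothetical file name following the test_*.py pattern)


def normalize_record(record):
    """Hypothetical pipeline helper, defined inline so the example is self-contained.

    Strips whitespace from each key and lowercases it; values are left unchanged.
    """
    return {key.strip().lower(): value for key, value in record.items()}


def test_normalize_record_cleans_keys():
    raw = {" User_ID ": 7, "Email": "a@example.com"}
    assert normalize_record(raw) == {"user_id": 7, "email": "a@example.com"}


def test_normalize_record_handles_empty_input():
    assert normalize_record({}) == {}
```

pytest collects functions whose names start with test_ from files matching test_*.py, so tests written this way should be picked up by the test steps above without extra configuration.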

Data Flow

High-level description of the data source(s) and sink(s), as well as the general pattern and data flow through the pipeline. Discuss any assumptions made.
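To make the source-to-sink pattern concrete, here is a minimal sketch of one common shape for such a pipeline: extract, transform, load. Everything in it is an assumption made for illustration (the CSV source, the JSON-lines sink, the id and amount columns, and the function names); replace it with the flow your project actually uses.

```python
import csv
import io
import json


def extract(source):
    """Yield rows as dicts from a CSV source (any text file object)."""
    yield from csv.DictReader(source)


def transform(rows):
    """Drop rows with an empty id and cast amount to float."""
    for row in rows:
        if row.get("id"):
            yield {"id": row["id"], "amount": float(row["amount"])}


def load(rows, sink):
    """Write one JSON object per line to the sink and return the row count."""
    count = 0
    for row in rows:
        sink.write(json.dumps(row) + "\n")
        count += 1
    return count


# In-memory stand-ins for a real source and sink (hypothetical sample data).
source = io.StringIO("id,amount\n1,9.50\n,3.00\n2,4.25\n")
sink = io.StringIO()
written = load(transform(extract(source)), sink)
```

Because each stage is a generator, rows stream through one at a time rather than being held in memory, and each stage can be unit tested on its own with small in-memory inputs.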

Languages

Dockerfile 72.5%, Python 27.5%