atgoud / mobydq

:whale: Tool to automate data quality checks on data pipelines

Home Page: https://mobydq.github.io

MobyDQ

MobyDQ is a tool for data engineering teams to automate data quality checks on their data pipelines, capture data quality issues and trigger alerts when anomalies occur, regardless of the data sources they use.

This tool was inspired by an internal project developed at Ubisoft Entertainment to measure and improve the data quality of its Enterprise Data Platform. This open source version has been reworked to improve and simplify its design and to remove technical dependencies on commercial software.

Getting Started

Skip the blah blah and run your data quality indicators by following the Getting Started page. The complete documentation is also available on GitHub Pages: https://mobydq.github.io.

Run Tests

You can run all tests locally using the following commands:

$ cd mobydq
$ # Backend
$ test/run-tests.sh
$ # Frontend
$ app/run-container.sh npm run test

Run Linter

Depending on your editor, eslint and pylint can be integrated into it directly. You can also run all linters locally using the following commands:

$ cd mobydq
$ # Backend
$ test/run-linter.sh
$ # Frontend
$ app/run-container.sh npm run lint
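
The test and linter commands above can also be chained so that one failing step does not hide the others. The following is only a minimal sketch: the function name `run_checks` is hypothetical and not part of MobyDQ, and it assumes the commands are launched from the repository root.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MobyDQ): run every command passed
# as an argument, keep going after a failure, and return the number of
# commands that failed.
run_checks() {
  rc_failures=0
  for rc_cmd in "$@"; do
    echo "==> ${rc_cmd}"
    # Each argument is a full command line, so let sh split it into words.
    if ! sh -c "${rc_cmd}"; then
      echo "FAILED: ${rc_cmd}" >&2
      rc_failures=$((rc_failures + 1))
    fi
  done
  return "${rc_failures}"
}

# Example call, assuming the current directory is the mobydq repository root:
# run_checks \
#   "test/run-tests.sh" \
#   "app/run-container.sh npm run test" \
#   "test/run-linter.sh" \
#   "app/run-container.sh npm run lint"
```

A return status of 0 means every command succeeded. Any other value is the count of failed commands, which is convenient in a pre-commit hook or a local script.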

Dependencies

Docker Images

The containers run by docker-compose depend on the following Docker images:

Python Packages

About

License: Apache License 2.0


Languages

JavaScript 36.9%, Python 36.5%, Shell 17.9%, PLpgSQL 5.6%, Dockerfile 1.5%, HTML 1.3%, CSS 0.3%