ubisoft / mobydq

:whale: Tool to automate data quality checks on data pipelines

Home Page: https://ubisoft.github.io/mobydq/

MobyDQ

MobyDQ is a tool for data engineering teams to automate data quality checks on their data pipelines, capture data quality issues, and trigger alerts when anomalies occur, regardless of the data sources they use.

This tool was inspired by an internal project developed at Ubisoft Entertainment to measure and improve the data quality of its Enterprise Data Platform. This open source version has been reworked to improve and simplify its design and to remove technical dependencies on commercial software.

Getting Started

Skip the bla bla and run your data quality indicators by following the Getting Started page. The complete documentation is also available on GitHub Pages: https://ubisoft.github.io/mobydq.

Screenshots

Some screenshots of the web application to give you a taste of what it looks like.

Run Dev

Run MobyDQ in development mode with the following commands:

$ cd mobydq
$ docker-compose -f docker-compose.yml -f docker-compose.dev.yml up db graphql app nginx

Run Prod

Run MobyDQ in production mode with the following commands. The -d flag runs the containers in the background (detached mode).

$ cd mobydq
$ docker-compose up -d db graphql app nginx
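
The development and production commands above differ only in the extra compose file and the -d flag. As a rough sketch, they could be wrapped in a small helper. The function name `mobydq_up` and the `DRY_RUN` variable are hypothetical and are not part of MobyDQ; the docker-compose commands themselves are the ones shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of MobyDQ): start the stack in dev or prod mode.
# Run it from the mobydq directory. With DRY_RUN=1 the command is only printed.
mobydq_up() {
  mode="${1:-dev}"
  services="db graphql app nginx"
  case "$mode" in
    dev)  cmd="docker-compose -f docker-compose.yml -f docker-compose.dev.yml up $services" ;;
    prod) cmd="docker-compose up -d $services" ;;
    *)    echo "usage: mobydq_up dev|prod" >&2; return 1 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    # Intentionally unquoted so the string splits into command and arguments.
    $cmd
  fi
}
```

For example, `DRY_RUN=1 mobydq_up prod` prints the production command without starting any container, which is a cheap way to check what would run.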

Run Tests

You can run tests using the following commands:

$ cd mobydq

# Start test database instances
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d db graphql
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d db-cloudera db-mysql db-mariadb db-postgresql db-sql-server

# Run tests
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml up test-db test-scripts

# Run linter
$ docker-compose -f docker-compose.yml -f docker-compose.test.yml build test-scripts test-lint-python
$ docker run --rm mobydq-test-lint-python pylint scripts test
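
The steps above can also be chained in one script that stops at the first failing step. This is only a sketch: the names `run_step`, `mobydq_test` and `DRY_RUN` are hypothetical, and the final `down` call is an assumption about the desired cleanup, not something the project prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not part of MobyDQ): run the test steps in order.
# Run it from the mobydq directory.
COMPOSE="docker-compose -f docker-compose.yml -f docker-compose.test.yml"

# Print each command; execute it only when DRY_RUN is not 1.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

mobydq_test() {
  # $COMPOSE is left unquoted on purpose so it splits into separate arguments.
  run_step $COMPOSE up -d db graphql &&
  run_step $COMPOSE up -d db-cloudera db-mysql db-mariadb db-postgresql db-sql-server &&
  run_step $COMPOSE up test-db test-scripts &&
  run_step $COMPOSE build test-scripts test-lint-python &&
  run_step docker run --rm mobydq-test-lint-python pylint scripts test
  status=$?
  # Assumption: stop and remove the test containers afterwards, whatever the outcome.
  run_step $COMPOSE down
  return $status
}
```

Calling `mobydq_test` with `DRY_RUN=1` set only lists the commands, one per line, prefixed with `+`.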

Dependencies

Docker Images

Python Packages

JavaScript Packages

  • To be documented

About

License: Apache License 2.0


Languages

Vue 36.5%, Shell 23.3%, Python 19.0%, JavaScript 7.4%, PLpgSQL 6.7%, TSQL 4.5%, Dockerfile 1.8%, HTML 1.0%