Bl3f / data-drift

Data versioning and diffing ⭐️ You can star to support our work!

Home Page: https://www.data-drift.io/




Data versioning and diffing

Datadrift is an agnostic, lightweight storage and version-control technology for tracking changes to mutable data sources.

Website · Blog · Issues


👋 About

🥵 Storing and handling data history is complex and expensive

Tools, databases and warehouses have a hard time tracking and displaying historical changes.

  • For the vast majority of companies, there is no access to the historical state of their own data (i.e. how data changes over time)

  • For a select few, keeping track of historical changes comes at a great cost in data engineering and outdated modeling trade-offs

👉 Open-source versioned storage and dedicated tools to work with data history

DataDrift makes handling data history easy with modern and open-source version control tools for data.

Simple & lightweight technology for all

  • Easy to implement (<15 min): Add 1 line of code in your pipeline to historize data. Use the one-click install on your CRM, spreadsheet or any data source (coming soon; open an issue to request a specific connector, or contribute by building it with the community 😇)

  • Free: Reduce your storage and optimize your warehouse bill with our lightweight storage for data history. Storage is done in a dedicated git repository, at no additional cost if you use GitHub.
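To illustrate why a git repository can work as storage for data history, here is a minimal sketch in Python. It is not Datagit's API: the function name `snapshot_to_csv` and the sample data are hypothetical. It assumes each snapshot is written as a CSV file with a stable row and column order, so that committing the file with ordinary git commands produces a diff limited to the rows that really changed.

```python
import csv
import io


def snapshot_to_csv(rows, key):
    """Serialize rows (a list of dicts) to CSV text with a stable layout.

    Hypothetical helper, not part of Datagit. Rows are sorted by the
    string form of `key` and columns are sorted by name, so the same data
    always produces identical text regardless of input order.
    """
    columns = sorted({name for row in rows for name in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in sorted(rows, key=lambda r: str(r[key])):
        writer.writerow(row)
    return buffer.getvalue()


# Two snapshots of the same (made-up) table, taken on different days
# and delivered in a different row order.
monday = [{"id": 2, "mrr": 50}, {"id": 1, "mrr": 100}]
tuesday = [{"id": 1, "mrr": 100}, {"id": 2, "mrr": 55}]

monday_csv = snapshot_to_csv(monday, key="id")
tuesday_csv = snapshot_to_csv(tuesday, key="id")

# Lines that differ between the two files: this is what a git diff
# between the two commits would show.
changed_lines = [
    (old, new)
    for old, new in zip(monday_csv.splitlines(), tuesday_csv.splitlines())
    if old != new
]
print(changed_lines)
```

Because the layout is deterministic, a reordering of the source rows alone creates no diff; only the row whose value changed shows up.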

Open-source, Open Architecture

  • Secure: Deploy on your own infra to keep 100% control over your data and access

  • Flexible: compose your own Datadrift based on our building blocks

  • Integrated: not another tool to manage in your stack; DataDrift is API-first and stays within your current tools


⚡️ Use cases

Unlock targeted use-cases with specific tools on top of our versioning and diffing technology.

Here are some examples of how users leverage Datadrift.

🔔 Monitoring drift with custom alerting


How can you expect a data analyst to detect a data quality issue when all they see is a number that is slightly higher or lower on each report?

Become aware of unknown unknowns in your data quality with data or metric drift alerting. Monitor the quality and consistency of your reporting and metrics over time.
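As a rough illustration of metric drift alerting, the sketch below compares the same metric as computed in two successive snapshots and reports the periods whose already-reported value moved. It is an assumption-laden example, not Datadrift's implementation: the function `detect_metric_drift`, the `tolerance` parameter and the sample figures are all hypothetical.

```python
def detect_metric_drift(previous, current, tolerance=0.0):
    """Return periods whose already-reported metric value changed.

    Hypothetical helper. `previous` and `current` map a period label to
    the metric value computed in two successive snapshots. A period is
    reported when its relative change exceeds `tolerance`. Periods that
    exist in only one snapshot are ignored.
    """
    drifts = []
    for period in sorted(previous.keys() & current.keys()):
        old, new = previous[period], current[period]
        if old == new:
            continue
        # A change from exactly zero has no finite relative size.
        change = (new - old) / abs(old) if old != 0 else float("inf")
        if abs(change) > tolerance:
            drifts.append((period, old, new))
    return drifts


# Made-up monthly revenue as reported yesterday and today.
yesterday = {"2023-01": 1000, "2023-02": 1200}
today = {"2023-01": 1000, "2023-02": 1230, "2023-03": 900}

alerts = detect_metric_drift(yesterday, today, tolerance=0.01)
print(alerts)
```

A new period such as "2023-03" is not drift, since nothing was reported for it before; only a change to a previously reported value triggers an alert.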

🔬 Troubleshooting & data reconciliation

Operationalize your monitoring and solve the underlying data quality issue by drilling down across historical data to understand the root cause of the problem.

🔄 Safe database/ERP/CRM migrations

Migrate between tools safely and without hassle, with comparisons and diff checks before/after and within/across databases.
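A before/after diff check of this kind can be sketched as a key-based comparison of two tables. The example below is a simplified illustration in Python, not Datadrift's own comparison logic: `diff_tables` and the sample CRM rows are hypothetical, and it assumes every row carries a unique key column.

```python
def diff_tables(before, after, key):
    """Compare two tables (lists of dicts) on a unique key column.

    Hypothetical helper. Returns the keys that were added, removed, or
    whose row content changed between `before` and `after`.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}
    shared = before_by_key.keys() & after_by_key.keys()
    return {
        "added": sorted(after_by_key.keys() - before_by_key.keys()),
        "removed": sorted(before_by_key.keys() - after_by_key.keys()),
        "changed": sorted(k for k in shared if before_by_key[k] != after_by_key[k]),
    }


# Made-up contact table exported from the old and the new CRM.
old_crm = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "c@example.com"},
]
new_crm = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "B@example.com"},
    {"id": 4, "email": "d@example.com"},
]

report = diff_tables(old_crm, new_crm, key="id")
print(report)
```

An empty report for all three categories would indicate that the migrated table matches the source on every keyed row.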

🧠 And much more

We'd love to hear from you if you have any other use case. Just open a new issue to tell us more about it and see how we could help!


🚀 Quickstart

Install our versioning and diffing library

Install Datagit to historize and diff-check the data you want.

This is a mandatory step to unlock any use case on top of it. You can learn more about Datagit in this article.

Deploy Datadrift locally

Follow our step-by-step installation guide to use Datadrift.

Use our cloud-based product

Contact our team by filling out the form on our website to get started with Datadrift Cloud.


💚 Helping us

We 💚 contributions big and small. In priority order (although everything is appreciated) with the most helpful first:


🗓 Upcoming features

Coming Soon

🌀 Automatic lineage drill-down and diff checks. Learn more about this feature

Coming later this year

🗓 Warehouse (BigQuery, Snowflake) & databases (Postgres, MongoDB) native integrations

🗓 BI tools integration

🗓 Gsheet integration

Track planning on GitHub Projects and help us prioritize by upvoting or creating issues.


License: GNU General Public License v3.0


Languages

Go 30.1% · TypeScript 27.4% · Jupyter Notebook 25.4% · Python 16.0% · CSS 0.6% · JavaScript 0.3% · HTML 0.1% · Dockerfile 0.1%