There are 13 repositories under the dataquality topic.
Always know what to expect from your data.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
ML powered analytics engine for outlier detection and root cause analysis.
The premier open source Data Quality solution
Open Source Data Quality Monitoring.
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
Simplifies storing test results and visualising them in a BI dashboard.
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution via data profiling, new-dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.
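The profile-then-generate-tests workflow described above can be sketched in plain Python. This is an illustrative sketch, not TestGen's actual API: the table, columns, and rules derived from the profile are all made up for the example.

```python
import sqlite3

def profile_column(cur, table, column):
    """Collect simple profile stats for one column: row count, nulls, distinct values."""
    cur.execute(f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}) FROM {table}")
    total, non_null, distinct = cur.fetchone()
    return {"total": total, "nulls": total - non_null, "distinct": distinct}

def generate_tests(profile):
    """Turn profile stats into data quality tests to run against future refreshes."""
    tests = []
    if profile["nulls"] == 0:
        tests.append("not_null")   # column was fully populated at profiling time
    if profile["distinct"] == profile["total"]:
        tests.append("unique")     # column looked like a key
    return tests

# Profile a toy table and derive tests from what the profile shows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "a@x.com"), (2, "b@x.com"), (3, "c@x.com")])
cur = conn.cursor()
stats = profile_column(cur, "customers", "id")
print(stats)                  # {'total': 3, 'nulls': 0, 'distinct': 3}
print(generate_tests(stats))  # ['not_null', 'unique']
```

The generated tests would then be re-executed on every data refresh, turning a one-off profile into ongoing monitoring.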
Datailot-cli is the command-line interface for accessing the AI teammate that helps engineers follow best practices in their SQL and dbt projects.
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
Tutorials and examples of data quality in big data systems.
Run greatexpectations.io on any SQL engine via a REST API. Built with FastAPI, Pydantic, and SQLAlchemy.
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It enables a corporate single-source-of-truth data strategy based on data governance best practices. Using the library, you can implement tables with primary key and foreign key enforcement on insert and update, plus validation of nulls, text lengths, numeric and date min/max values, unique values, and default values. It also lets you classify fields by applicability of ARCO rights to ease compliance with GDPR-style data protection laws, and identify security levels and whether any encryption is applied. Additionally, it supports more complex validation rules over the same table.
:zap: Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.
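The CI/CD gating pattern described here can be illustrated with a generic sketch. This is not the Soda Library API: the check definitions, sample rows, and exit-code convention are all assumptions made for the example, but the shape is the same: run checks against fresh data and fail the pipeline step on any violation.

```python
def run_checks(rows, checks):
    """Run simple data quality checks over rows; return the names of failed checks."""
    failures = []
    for name, predicate in checks:
        if not all(predicate(row) for row in rows):
            failures.append(name)
    return failures

rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": -5.00},   # bad record: negative amount
]
checks = [
    ("order_id_not_null", lambda r: r["order_id"] is not None),
    ("amount_non_negative", lambda r: r["amount"] >= 0),
]

failed = run_checks(rows, checks)
# A non-zero exit code makes the CI step fail, blocking the downstream deploy.
exit_code = 1 if failed else 0
print(failed, exit_code)  # ['amount_non_negative'] 1
```

In a real pipeline the script would end with `sys.exit(exit_code)` so the CI runner marks the job red before bad data reaches consumers.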
Code accompanying the data quality with Great Expectations blog post.
Open source clients for working with Data Culpa Validator services from data pipelines
Make your dataset talk to you. The AI assistant for data preparation.
Enhance your data testing seamlessly with this Dataform package, featuring robust common assertions to ensure the accuracy and integrity of your warehouse data.
🦆 Blazing fast and highly customizable GitHub Action to set up a DuckDB runtime
SQL-based data profiling and data quality checks, performed at the table and database level on a SQL database.
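Table-level profiling of the kind this description refers to can be sketched with the stdlib `sqlite3` module. This is an illustrative example, not the repository's own scripts; the table and columns are invented for the demo.

```python
import sqlite3

# Build a toy table to profile.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, country TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "US"), (2, None), (2, "DE"), (3, "US")])

# Table-level profile: one row of summary statistics computed in SQL.
query = """
SELECT COUNT(*)                                          AS row_count,
       SUM(CASE WHEN country IS NULL THEN 1 ELSE 0 END)  AS country_nulls,
       COUNT(DISTINCT user_id)                           AS distinct_users
FROM events
"""
row_count, country_nulls, distinct_users = conn.execute(query).fetchone()
print(row_count, country_nulls, distinct_users)  # 4 1 3
```

Pushing the aggregation into SQL keeps the profiling work inside the database engine, which is what makes this approach scale to large tables.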