agile-lab-dev / witboost-dbt-workload-template

witboost

Designed by Agile Lab, Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. It enables businesses to discover, enhance, and productize their data, fostering the creation of automated data platforms that adhere to the highest standards of data governance. Want to know more about Witboost? Check it out here or contact us!

This repository is part of our Starter Kit meant to showcase Witboost's integration capabilities and provide a "batteries-included" product.

DBT Workload Template

Overview

Use this template to automatically create a Workload based on a dbt project that can be referenced from other components inside the platform.

This component does not provision any resources or infrastructure, and as such can be used without any Specific Provisioner.

What's a Template?

A Template is a tool that helps create components inside a Data Mesh. Templates help establish a standard across the organization. This standard leads to easier understanding, management and maintenance of components. Templates provide a predefined structure so that developers don't have to start from scratch each time, which leads to faster development and allows them to focus on other aspects, such as testing and business logic.

For more information, please refer to the official documentation.

What's a Workload?

A Workload is any data processing step (ETL, job, transformation, etc.) applied to data in a Data Product. Workloads can pull data from sources external to the Data Mesh, from an Output Port of a different Data Product, or from Storage Areas inside the same Data Product, and persist it for further processing or serving.

DBT

dbt (data build tool) is a transformation tool that lets data analysts and engineers transform, test, and document data inside their data warehouse. Transformations are written as SQL and kept under version control. A dbt pipeline is built from simple SELECT statements and views, chained so that each step shapes the raw data further into a form suitable for analysis. dbt also supports data tests that check the validity and quality of the transformed data, and it can generate documentation automatically, which makes datasets easier for teams to understand.
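The idea of chaining SELECT-based views and then testing the result can be illustrated without dbt itself. The following is only a minimal sketch using Python's built-in sqlite3 module: the table and view names (raw_orders, stg_orders, customer_totals) and the helper not_null_failures are hypothetical, and the check merely imitates the spirit of a not-null data test rather than reproducing how dbt implements it.

```python
import sqlite3


def build_pipeline(conn: sqlite3.Connection) -> None:
    """Create a raw table and two chained views (all names are hypothetical)."""
    conn.executescript(
        """
        CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount REAL);
        INSERT INTO raw_orders VALUES
            (1, 'alice', 10.0),
            (2, 'bob', 25.5),
            (3, 'alice', 4.5),
            (4, NULL, 7.0);

        -- First step: a plain SELECT that filters out incomplete rows.
        CREATE VIEW stg_orders AS
            SELECT id, customer, amount
            FROM raw_orders
            WHERE customer IS NOT NULL;

        -- Second step: a SELECT that builds on the previous view.
        CREATE VIEW customer_totals AS
            SELECT customer, SUM(amount) AS total_amount
            FROM stg_orders
            GROUP BY customer;
        """
    )


def not_null_failures(conn: sqlite3.Connection, relation: str, column: str) -> int:
    """Count rows where the column is NULL; zero means the check passes.

    The identifiers are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    query = f"SELECT COUNT(*) FROM {relation} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
build_pipeline(conn)

totals = conn.execute(
    "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)
print(not_null_failures(conn, "raw_orders", "customer"))
print(not_null_failures(conn, "customer_totals", "customer"))
```

In this sketch the raw table fails the check because one row has no customer, while the final view passes because the staging step filtered that row out. In a real dbt project the views would be SQL model files and the check would be declared as a test in the project configuration.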

Learn more about it on the official website.

Usage

To get information on how to use this template, refer to this document.

Component Testing

Before deploying the component along with the Data Product, verify it by testing it against a CUE Policy defined for DBT Workloads. This policy must be defined in the Governance section of the Witboost platform.

For more information, please refer to the official documentation.

Artifacts

This project uses Python setuptools and build for packaging. Build artifacts with:

python -m build --wheel

To change the default version, edit the version.py file and set the value you need.
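The layout of version.py is not shown here. As a rough sketch, assuming the file assigns a string to a variable named `__version__` (an assumption; the actual file in this repository may be organized differently), a small helper such as the hypothetical read_version below could be used to confirm the value after editing it:

```python
import re


def read_version(source: str) -> str:
    """Extract the version string from the text of a version file.

    Assumes a line of the form: __version__ = "1.2.3"
    (this layout is an assumption, not taken from the repository).
    """
    match = re.search(
        r"""^__version__\s*=\s*["']([^"']+)["']""",
        source,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


# Hypothetical file content, used only for illustration.
example_source = '__version__ = "1.2.0"\n'
print(read_version(example_source))
```

To check a real file, pass its text to the helper, for example `read_version(open("version.py").read())`, and adjust the pattern if the file uses a different variable name.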

License

This project is available under the Apache License, Version 2.0; see LICENSE for full details.

About us

Agile Lab

Agile Lab creates value for its Clients in data-intensive environments through customizable solutions to establish performance-driven processes, sustainable architectures, and automated platforms driven by data governance best practices.

Since 2014 we have implemented 100+ successful Elite Data Engineering initiatives and used that experience to create Witboost: a technology-agnostic, modular platform that empowers modern enterprises to discover, elevate, and productize their data, both in traditional environments and on fully compliant Data Mesh architectures.

Contact us or follow us.

Languages

CUE 92.7%, Python 7.3%