Northwind Database OLTP to OLAP Transformation: Leveraging Dimensional Modeling for Advanced Analytics
This project unlocks the power of advanced analytics and reporting by transforming an OLTP architecture into an efficient OLAP system. It leverages dbt and BigQuery to implement dimensional modelling and support data-driven decision-making.
The goal is to modernise Northwind's data reporting solution through dimensional modelling.
What is the current architecture?
- Northwind Traders is an export-import company that trades specialty foods around the world
- Northwind is a sample database created by Microsoft to demonstrate the features of some of its products, and for training and tutorials
- The existing architecture is a mix of on-premises and legacy systems
- They use MySQL for their daily sales transactions
- They also use MySQL to build and run reports, which is inefficient because the analytical queries slow down the transactional system
Why the need for a new architecture?
- For better scalability
- To improve reporting speed
- To reduce the load on operational systems
- To improve data security through better access control
How do we implement a new architecture?
- Northwind Traders can migrate its existing database to GCP
- The on-prem MySQL instance can be replaced by a fully managed Cloud SQL instance
- For reporting, an OLAP data warehouse will be built on GCP using BigQuery
- A dimensional data warehouse will be built on BigQuery using Kimball's approach, with dimension (dim) and fact tables
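To make the dim/fact split concrete, here is a minimal sketch of a star schema. It is not the project's actual dbt code: it uses Python's built-in `sqlite3` as a stand-in for both MySQL and BigQuery, and the table names, columns, and sample rows (`customers`, `products`, `order_details`, `dim_customer`, `dim_product`, `fact_sales`) are simplified assumptions rather than the real Northwind schema.

```python
import sqlite3

# In-memory SQLite stands in for both the MySQL source and the BigQuery warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified OLTP source tables (hypothetical subset of Northwind columns)
    CREATE TABLE customers (id INTEGER PRIMARY KEY, company TEXT, city TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
    CREATE TABLE order_details (
        order_id INTEGER, customer_id INTEGER, product_id INTEGER,
        quantity INTEGER, unit_price REAL
    );

    INSERT INTO customers VALUES (1, 'Company A', 'Seattle'), (2, 'Company B', 'Boston');
    INSERT INTO products VALUES (10, 'Chai', 'Beverages'), (20, 'Olive Oil', 'Oil');
    INSERT INTO order_details VALUES
        (100, 1, 10, 5, 18.0),
        (100, 1, 20, 2, 21.5),
        (101, 2, 10, 3, 18.0);

    -- Dimension tables: descriptive attributes, one row per entity
    CREATE TABLE dim_customer AS
        SELECT id AS customer_id, company, city FROM customers;
    CREATE TABLE dim_product AS
        SELECT id AS product_id, product_name, category FROM products;

    -- Fact table: one row per order line, with dimension keys and numeric measures
    CREATE TABLE fact_sales AS
        SELECT order_id, customer_id, product_id, quantity, unit_price,
               quantity * unit_price AS line_total
        FROM order_details;
""")

# A typical analytical query: join the fact table to a dimension and aggregate.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.line_total) AS revenue
    FROM fact_sales AS f
    JOIN dim_product AS p USING (product_id)
    GROUP BY p.category
    ORDER BY revenue DESC
""").fetchall()
print(sales_by_category)
```

In the real project each `CREATE TABLE ... AS SELECT` would more likely live in its own dbt model file, with dbt handling materialisation in BigQuery; the sketch only illustrates how measures sit in the fact table while descriptive attributes sit in the dimensions.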
Many business processes can be derived from the Northwind database through its E-R diagram. However, we will focus on three:
- Sales Overview: Overall sales reports that show what is being sold to our customers, what sells the most and the least, and where. The goal is a general overview of how the business is performing.
- Product Inventory: Understand the current inventory levels, which suppliers we have, and how much is being purchased. This will allow Northwind to improve stock management and potentially land better deals with suppliers.
- Customer Reporting: Allow customers to understand their purchase orders (how much they buy and when), empowering them to make data-driven decisions while Northwind uses this data in combination with its sales data.
The image below shows the three layers (datasets) created in BigQuery through dbt. They are identified by the "dbt prefix".
- Conceptual Data Model
- Logical Data Model
- Physical Data Model
- The new data warehouse uses BigQuery for analytics and business intelligence, which keeps analytical queries off the transactional MySQL system and is more efficient for reporting
- Reporting is derived from One Big Table (OBT), denormalised from the dimensional models
- The Sales Overview, Product Inventory, and Customer Reporting processes can now be carried out effectively to draw out insights
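As a rough illustration of the One Big Table idea, the sketch below flattens a small star schema into a single wide reporting table. It is only a sketch under stated assumptions: `sqlite3` stands in for BigQuery, and the names `dim_customer`, `dim_product`, `fact_sales`, and `obt_sales`, along with the sample rows, are hypothetical rather than taken from the project's models.

```python
import sqlite3

# In-memory SQLite stands in for the BigQuery warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, already-built dimensional models
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, company TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE fact_sales (
        order_id INTEGER, customer_id INTEGER, product_id INTEGER,
        quantity INTEGER, unit_price REAL
    );

    INSERT INTO dim_customer VALUES (1, 'Company A'), (2, 'Company B');
    INSERT INTO dim_product VALUES (10, 'Chai'), (20, 'Olive Oil');
    INSERT INTO fact_sales VALUES
        (100, 1, 10, 5, 18.0),
        (100, 1, 20, 2, 21.5),
        (101, 2, 10, 3, 18.0);

    -- One Big Table: every fact row with its dimension attributes joined in,
    -- so reports can read a single table without further joins.
    CREATE TABLE obt_sales AS
        SELECT f.order_id,
               c.company,
               p.product_name,
               f.quantity,
               f.quantity * f.unit_price AS line_total
        FROM fact_sales AS f
        LEFT JOIN dim_customer AS c ON c.customer_id = f.customer_id
        LEFT JOIN dim_product AS p ON p.product_id = f.product_id;
""")

obt_rows = conn.execute(
    "SELECT * FROM obt_sales ORDER BY order_id, product_name"
).fetchall()

# Reporting query that needs no joins: units sold per product.
top_products = conn.execute("""
    SELECT product_name, SUM(quantity) AS units
    FROM obt_sales
    GROUP BY product_name
    ORDER BY units DESC
""").fetchall()
print(top_products)
```

The trade-off is the usual one for denormalised tables: reporting queries become simpler and avoid joins, at the cost of repeating dimension attributes on every row.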
- Commands to install dbt and connect to BigQuery here
- Commands to create tables and insert data here
- Commands to create dim and fact tables in different layers can be found here
- If you are not able to enable billing for BigQuery on your account, insert data manually by uploading the CSV files located here
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices