There are 8 repositories under the lakehouse topic.
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
The world's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
Examples of using Terraform to deploy Databricks resources
A curated list of open source tools used in analytical stacks and the data engineering ecosystem
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
DeltaOMS is a solution that helps build a centralized repository of Delta Transaction logs and associated operational metrics/statistics for your Delta Lakehouse. Unity Catalog is supported in the v0.7.0-rc1 release. Documentation: https://databrickslabs.github.io/delta-oms/v0.7.0-rc1/
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
Repository dedicated to a Data Lakehouse workshop with Delta Lake
Unlocking the Power of Health Data With a Modern Data Lakehouse
Microsoft Fabric Real-time Analytics flight streaming
Tutorials and examples of how to deploy Presto and connect it to different data sources
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
Stream Loader for Apache Doris
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
Automated provisioning of an industry Lakehouse with enterprise data model
Build Your First End-to-End Lakehouse Solution (aka.ms/fabconlake)
A comprehensive educational resource hub dedicated to mastering Microsoft Fabric, offering in-depth tutorials, real-world use cases, and hands-on guides for seamless end-to-end analytics
Leverage the Databricks Solution Accelerator for DNS analytics to accelerate time to detection and response across petabytes of data. Tap into DNS traffic logs, enrich streaming threat intelligence, and apply advanced analytics to detect DNS abnormalities and prevent malicious attacks.
Supercharge Your Compute for Analytics & AI
Overall Equipment Effectiveness: Performant and Scalable End-to-End Equipment Monitoring