There are 46 repositories under the databricks topic.
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
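🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and documentation require no code; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.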
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Code examples and resources for DBRX, a large language model developed by Databricks
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Customer Data Platform (CDP)
DataOps for the Modern Data Warehouse on Microsoft Azure. https://aka.ms/mdw-dataops.
🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
Databricks Terraform Provider
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Capture deep metrics on one or all assets within a Databricks workspace
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
Manage your Databricks deployments and CI with code.
Apache Spark Connector for Azure Cosmos DB
Automated migrations to Unity Catalog
Examples of using Terraform to deploy Databricks resources
A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities
Scalable Data Science: course sets in big data using Apache Spark on Databricks, covering their mathematical, statistical and computational foundations using SageMath.
machine learning for genomic variants
Tools for Deploying Databricks Solutions in Azure
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Collection of Sample Databricks Spark Notebooks (mostly for Azure Databricks)