There are 45 repositories under the databricks topic.
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Code examples and resources for DBRX, a large language model developed by Databricks
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack.
DataOps for the Modern Data Warehouse on Microsoft Azure. https://aka.ms/mdw-dataops.
Synmetrix – open source semantic layer / Boost your LLM precision
🧱 Databricks CLI eXtensions (aka dbx) is a CLI tool for development and advanced management of Databricks workflows.
Databricks Terraform Provider
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
This repo provides a customizable stack for starting new ML projects on Databricks that follow production best practices out of the box.
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Capture deep metrics on one or all assets within a Databricks workspace
Manage your Databricks deployments and CI with code.
Apache Spark Connector for Azure Cosmos DB
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities
Examples of using Terraform to deploy Databricks resources
Your best companion for upgrading to Unity Catalog. UCX will guide you, the Databricks customer, through the process of upgrading your account, groups, workspaces, jobs, etc. to Unity Catalog.
Scalable Data Science: course sets in big data using Apache Spark on Databricks, and their mathematical, statistical and computational foundations using SageMath.
Machine learning for genomic variants
Tools for Deploying Databricks Solutions in Azure
This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.