Databricks Labs (databrickslabs)

Labs projects to accelerate use cases on the Databricks Unified Analytics Platform

Home Page: https://databricks.com/learn/labs

Databricks Labs's repositories

dbx

🧱 Databricks CLI eXtensions (aka dbx) is a CLI tool for development and advanced Databricks workflow management.

Language: Python | License: NOASSERTION | Stargazers: 434 | Issues: 23 | Issues: 415

tempo

API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 306 | Issues: 23 | Issues: 95

dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

Language: Python | License: NOASSERTION | Stargazers: 300 | Issues: 14 | Issues: 75

mosaic

An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 269 | Issues: 11 | Issues: 174

overwatch

Capture deep metrics on one or all assets within a Databricks workspace

Language: Scala | License: NOASSERTION | Stargazers: 223 | Issues: 30 | Issues: 738

ucx

Automated migrations to Unity Catalog

Language: Python | License: NOASSERTION | Stargazers: 212 | Issues: 214 | Issues: 1066

dlt-meta

Metadata-driven Databricks Delta Live Tables framework for bronze/silver pipelines

Language: Python | License: NOASSERTION | Stargazers: 136 | Issues: 19 | Issues: 36

dataframe-rules-engine

Extensible Rules Engine for custom Dataframe / Dataset validation

Language: Scala | License: NOASSERTION | Stargazers: 134 | Issues: 13 | Issues: 20

discoverx

A Swiss Army knife for your Data Intelligence Platform administration.

Language: Python | License: NOASSERTION | Stargazers: 102 | Issues: 5 | Issues: 13

geoscan

Geospatial clustering at massive scale

Language: Scala | License: NOASSERTION | Stargazers: 94 | Issues: 6 | Issues: 6

smolder

HL7 Apache Spark Datasource

Language: Scala | License: Apache-2.0 | Stargazers: 57 | Issues: 12 | Issues: 9

feature-factory

Accelerator to rapidly deploy customized features for your business

Language: Python | License: NOASSERTION | Stargazers: 55 | Issues: 8 | Issues: 2

databricks-sync

An experimental tool to synchronize a source Databricks deployment with a target Databricks deployment.

Language: Python | License: NOASSERTION | Stargazers: 46 | Issues: 7 | Issues: 63

Language: Python | License: NOASSERTION | Stargazers: 43 | Issues: 5 | Issues: 0

transpiler

SIEM-to-Spark Transpiler

Language: Scala | License: NOASSERTION | Stargazers: 42 | Issues: 79 | Issues: 16

brickster

R Toolkit for Databricks

Language: R | License: Apache-2.0 | Stargazers: 37 | Issues: 2 | Issues: 21

delta-oms

DeltaOMS is a solution that helps build a centralized repository of Delta transaction logs and associated operational metrics/statistics for your Delta Lakehouse. Unity Catalog is supported in the v0.7.0-rc1 release. Documentation: https://databrickslabs.github.io/delta-oms/v0.7.0-rc1/

Language: Scala | License: NOASSERTION | Stargazers: 37 | Issues: 8 | Issues: 9

splunk-integration

Databricks Add-on for Splunk

Language: Python | License: NOASSERTION | Stargazers: 26 | Issues: 7 | Issues: 18

Language: Python | License: NOASSERTION | Stargazers: 23 | Issues: 8 | Issues: 8

arcuate

Delta Sharing + MLflow for ML model & experiment exchange (arcuate delta: a fan-shaped river delta)

Language: Python | License: NOASSERTION | Stargazers: 22 | Issues: 3 | Issues: 0

remorph

Cross-compiler and Data Reconciler into Databricks Lakehouse

Language: Scala | License: NOASSERTION | Stargazers: 22 | Issues: 7 | Issues: 310

databricks-sdk-r

Databricks SDK for R (Experimental)

Language: R | License: Apache-2.0 | Stargazers: 19 | Issues: 5 | Issues: 12

Language: Rich Text Format | License: NOASSERTION | Stargazers: 17 | Issues: 5 | Issues: 2

sandbox

Experimental or low-maturity things

Language: Go | License: NOASSERTION | Stargazers: 16 | Issues: 10 | Issues: 8

blueprint

Baseline for Databricks Labs projects written in Python

Language: Python | License: NOASSERTION | Stargazers: 14 | Issues: 5 | Issues: 20

delta-sharing-java-connector

A Java connector for delta.io/sharing/ that allows you to easily ingest data on any JVM.

Language: Java | License: Apache-2.0 | Stargazers: 13 | Issues: 5 | Issues: 2

Language: Scala | License: NOASSERTION | Stargazers: 12 | Issues: 4 | Issues: 3

lsql

Lightweight SQL execution wrapper built only on top of the Databricks SDK

Language: Python | License: NOASSERTION | Stargazers: 8 | Issues: 3 | Issues: 86

pylint-plugin

Databricks Plugin for PyLint

Language: Python | License: NOASSERTION | Stargazers: 8 | Issues: 5 | Issues: 27

waterbear

Automated provisioning of an industry Lakehouse with enterprise data model

Language: Python | License: NOASSERTION | Stargazers: 8 | Issues: 4 | Issues: 0