Kushal Bohra's repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow-essentials
Materials for Airflow training
airflow_project
Scaffold of Apache Airflow executing Docker containers
atlas
Apache Atlas
awesome-apache-airflow
Curated list of resources about Apache Airflow
azure-quickstart-templates
Azure Quickstart Templates
book-project
Book tracker web app
brickhouse
Hive UDFs for the data warehouse
coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
distsys-class
Class materials for a distributed systems lecture series
hadoop
Mirror of Apache Hadoop
hive
Mirror of Apache Hive
introduction_to_ml_with_python
Notebooks and code for the book "Introduction to Machine Learning with Python"
jumbune
Jumbune is an open-source Proactive ML based BigData platform performance accelerator & automated data quality management platform. Commercial offering is available at http://jumbune.com. More details of open source offering are at,
jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
kafka-connect-hdfs
Kafka Connect HDFS connector
llama-hub
A library of data loaders for LLMs made by the community -- to be used with GPT Index and/or LangChain
NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
overwatch
Capture deep metrics on one or all assets within a Databricks workspace
presto
The official home of the Presto distributed SQL query engine for big data
salt
Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
spark
Mirror of Apache Spark
sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
statusTracker
Monitor the status of cloud services using a Python-based application
stocksight
Crowd-sourced stock analyzer and predictor using Elasticsearch, Twitter, news headlines, and Python natural language processing and sentiment analysis
yt-dlc
Media downloader for various sites.