Dipesh Vora's repositories
spark-xml
XML data source for Spark SQL and DataFrames
spark-property-tests
Write property-based tests easily on Spark DataFrames
spark-dynamodb
Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.
sbt-release
A release plugin for sbt
databricks-api
Simple client for the Databricks REST API
mlflow-docker
Production-ready docker-compose configuration for MLflow with MySQL and MinIO S3
s3-sqs-connector
A library for reading data from Amazon S3 with optimised listing via Amazon SQS, using Spark SQL Streaming (Structured Streaming).
spark-scala-k8-app
A sample showing how to deploy Spark Scala code on Kubernetes using spark-ink8s-operation
java-design-patterns
This repo contains examples of Java Design Patterns
feature-selection-logistic-regression
Feature Selection and Logistic Regression on Spam dataset
tiny
tiny
eda
EDA of automobile data
mutual-fund-returns
Predict mutual fund returns in terms of bond spread
ga-learner-dsmp-repo
A collection of projects as part of the Data Science Masters Program at GreyAtom EduTech Pvt Ltd
olympic-hero
Olympic Hero
data-wrangling-pandas-code-along
This repository contains code for the code-along session for the concept Data Wrangling with Pandas
data-visualization-matplotlib
This repository contains code for the code-along session for the concept Data Visualization
manipulating-data-with-numpy-code-along
This repository contains code for the code along session for the concept Manipulating Data with NumP
handling-program-flow-in-python-code-along
This repository contains code for the code along session for Handling program Flow in Python Code
dsmp-pre-work
Code repository for the Pre-Work program at GreyAtom
getting-started-python-code-along
This repository contains code for the code-along session for the concept Getting Started with Python
hive-scd-examples
How to manage Slowly Changing Dimensions with Apache Hive
spark-integration-tests
Integration tests for Spark