Pavan Kulkarni's repositories
MultiCurrencyPiggyBankCalculator
This is a small project for my kid, who has a vast collection of coins of different currencies in her piggy bank.
sparkMeasure
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
bhai
Explore this fun language: bhailang
PythonDocker
A basic repo to build and deploy a simple Flask app on Docker
data-engineer-learning-path
Databricks Spark materials, as retired from Databricks Academy
pavanpkulkarni
Config files for my GitHub profile.
training-kit
Open source cheat sheets for Git and GitHub
SearchEngine_ES_Flask
This project is a quick demo of building a search engine using Elasticsearch
TestGitCommands
This repo is for getting hands-on with Git commands
The-Documentation-Compendium
📢 Various README templates & tips on writing high-quality documentation that people want to read.
ScalaTestWorkspace
This repo contains solutions to a coding challenge
Read_Write_HDFS_Spark_WordCount
Read_Write_HDFS_Spark_WordCount
KafkaFile_BatchProcessing
Project to demo file processing with an Avro schema in Scala using Gradle
Spark_WordCount_Gradle
This repo contains Spark word count code using the Gradle build tool
Docker_WordCount_Spark
Sample Spark program to run in a Docker setup
Spark_Mongo_Example
This repo contains MongoDB Spark sample code in Scala
Spark_Cassandra_Example
Sample code to demo the Spark Cassandra connector (Spark v. 2.x; Cassandra v. 3.x)
Spark_Streaming_Examples
This repo contains Spark Structured Streaming examples in Scala
spring-batch
spring-batch example projects
CreditCard_Fraud_Detection
Spark MLlib application for credit card fraud detection using Structured Streaming
MongoDB_Python
Repo to demo insert and delete operations for MongoDB in Python
create-and-run-spark-job
Create an n-node cluster and run a Spark job on Docker
docker-spark-image
This repo contains a Docker image for a Spark 2.2.1 cluster
PySpark_WordCount
Repo to demo basic WordCount in PySpark using PyCharm
Spark_Cassandra_Python
Repo to demo basic WordCount in PySpark using PyCharm
pavanpkulkarni.github.io
Test web hosting on GitHub
blog
This repo holds all the blog content
Topic_Classification
This repo contains a Python project that demonstrates topic classification, using LDA for topic modeling and then applying Word2Vec to classify the topics.