Jash Shah's repositories
RealTime-Streaming-Data-Pipeline-with-Kafka-and-GCP
A comprehensive implementation of a real-time streaming data pipeline using Apache Kafka and Google Cloud Platform (GCP) for efficient data ingestion, processing, and analysis.
Time-Series-Forecasting-of-Macro-Economic-Parameters
Time series analysis of macroeconomic parameters using a vector autoregression (VAR) model
-API-to-Snowflake-with-Airflow
Use Apache Airflow to extract data from the Alpha Vantage API and load it into Snowflake, integrating financial market data into your Snowflake data warehouse.
Automating-EMR-Cluster-using-AWS-Lambda
Automate Amazon EMR clusters using Lambda for streamlined and scalable data processing workflows. Unlock the full potential of your data pipeline with LambdaEMR Automator.
AWS-Big-Data-Pipeline-orchestrated-with-Airflow
A robust data pipeline leveraging Amazon EMR and PySpark, orchestrated seamlessly with Apache Airflow for efficient batch processing
AWS-EMR-based-Recommendation-Engine-Pipeline
Build and deploy a scalable Recommendation Engine leveraging AWS EMR, enabling efficient processing and analysis for personalized recommendations in large datasets.
HealthSynergy-Hospital-Data-Visualization
Data Visualization of a Healthcare Administration Dataset
Large-Language-Models-for-Medical-Data-Extraction
Using LLMs to extract medical data for doctors and hospitals
Gaussian-Mixture-Models-Implementation-from-Scratch
Python codebase that enables you to build and utilize GMMs without relying on external libraries or pre-built functions
Hierarchical-Clustering-and-PCA
Python codebase that implements hierarchical clustering and examines the effects of PCA on clustering
Integrating-Airflow-with-Snowflake
Automate data workflows seamlessly by integrating Apache Airflow with Snowflake, enabling efficient orchestration and management of data pipelines in the Snowflake data warehouse environment. Streamline data processing and enhance collaboration across your analytics infrastructure.
K-Means-and-K-means-plus-plus
Python codebase that implements the K-means and K-means++ algorithms from scratch
LinearRegressionProject
End-to-end implementation of a machine learning linear regression project
Miners-of-Wallstreet
Stock market prediction using various machine learning techniques and sentiment analysis of news and tweets
POC
Summarization POC using Gemini
Principal-Component-Analysis-from-Scratch
A Python implementation of Principal Component Analysis (PCA) from scratch, allowing for in-depth exploration and customization of the dimensionality reduction technique
Shipmate
Online delivery management system to automate all the manual processes in delivery management.
Sleep-Efficiency-Classifier
Exploratory data analysis for a sleep efficiency classifier
SQL-Server-Data-Migration-using-Azure
Created a pipeline to perform data migration from on-premises SQL Server Management Studio to the Azure cloud
Transfer-Learning
Using state-of-the-art transformers for text classification and deep CNNs for image classification