Amrut Deshpande (amrutdeshpande)



Company:Arizona State University

Location:Tempe, Arizona


Amrut Deshpande's repositories

Water-Pump-Machine-Failure-prediction

1. Created an automated machine-failure prediction solution that monitors daily incoming and historical data and predicts the chance of failure 30 days in advance, enabling the relevant stakeholders to take action and minimize losses.
2. Extracted the time-series data from MS SQL Server via a Python connection, cleaned it for discrepancies (outliers, class imbalance, etc.), performed EDA, and defined baseline metrics.
3. Built and trained machine learning models such as LSTM, RNN, and decision trees, and tuned their parameters to improve accuracy and execution efficiency. The predictions are then pushed into a UI built in Visual Studio and a Tableau dashboard for easy consumption by the stakeholders.
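A minimal sketch of the windowed-feature approach described above, using a decision tree (one of the models mentioned). All data here is synthetic and all names are illustrative; the real project pulls time series from MS SQL Server.

```python
# Hypothetical sketch: 30-day sensor windows -> summary features -> a
# decision-tree classifier that flags machines likely to fail.
# Synthetic data only; the real pipeline reads from MS SQL Server.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic daily readings: failing machines drift upward over the window.
n_machines, window = 200, 30
healthy = rng.normal(50, 5, size=(n_machines // 2, window))
failing = rng.normal(50, 5, size=(n_machines // 2, window)) + np.linspace(0, 15, window)
X_raw = np.vstack([healthy, failing])
y = np.array([0] * (n_machines // 2) + [1] * (n_machines // 2))

def window_features(w):
    """Mean, std, and linear trend (slope) of one 30-day window."""
    slope = np.polyfit(np.arange(len(w)), w, 1)[0]
    return [w.mean(), w.std(), slope]

X = np.array([window_features(w) for w in X_raw])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

The slope feature is what separates the drifting (failing) machines from the flat (healthy) ones in this toy setup.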

Language: Jupyter Notebook | Stargazers: 3 | Issues: 0

Design-of-Experiments-in-Identification-of-the-optimal-ingredient-mix-for-Pancakes-

Ran a randomized 2^4 factorial experiment to identify the factors affecting the optimal ingredient mix, using JMP and Minitab. Carried out statistical testing through ANOVA, parameter estimates, residual analysis, summary of fit, and response-surface plots.
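To illustrate the 2^4 design, here is a plain-NumPy sketch of main-effect estimation on a made-up response (the original analysis was done in JMP and Minitab; the factor names and response model below are invented):

```python
# Illustrative 2^4 full-factorial effect estimation.
# The response model is made up for demonstration purposes.
import itertools
import numpy as np

# Coded design matrix: every +/-1 combination of four factors (16 runs).
design = np.array(list(itertools.product([-1, 1], repeat=4)))

# Fake response: factor 0 and the 0x1 interaction matter, plus small noise.
rng = np.random.default_rng(1)
y = 5.0 + 2.0 * design[:, 0] + 1.5 * design[:, 0] * design[:, 1] \
    + rng.normal(0, 0.1, 16)

# Main effect of a factor = mean(y at +1) minus mean(y at -1).
main_effects = {
    f"factor_{i}": y[design[:, i] == 1].mean() - y[design[:, i] == -1].mean()
    for i in range(4)
}
```

Because the design is balanced, the 0x1 interaction averages out of the factor-0 main effect, which comes out near 4 (twice the coefficient), while inert factors come out near zero.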

Stargazers: 1 | Issues: 0

Prediction-of-Loan-Status-using-Machine-Learning-Classifiers

Built visualizations and drew statistical inferences to understand the relationship between the independent variables and the Loan Status target variable. Handled the big data with an SGD classifier's partial_fit using the best tuning parameters, and applied PCA to reduce the dimensionality. Fitted the chunked data with various classifiers such as SVM, logistic regression, and random forest to obtain the best F1 score, AUC, and accuracy.
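The chunked partial_fit idea can be sketched as follows (synthetic data; the real project also applied PCA and compared SVM, logistic regression, and random forest):

```python
# Rough sketch of out-of-core training: stream the data in chunks and
# update an SGDClassifier incrementally with partial_fit.
# Synthetic "loan" data; all shapes and sizes are illustrative.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(3000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # stand-in loan-status target

clf = SGDClassifier(random_state=0)
for start in range(0, len(X), 500):          # 500-row "chunks"
    chunk_X = X[start:start + 500]
    chunk_y = y[start:start + 500]
    # classes must be passed on the first call so the model knows all labels
    clf.partial_fit(chunk_X, chunk_y, classes=[0, 1])

accuracy = clf.score(X, y)
```

partial_fit is the key: the model never needs the full dataset in memory, which is what makes the approach viable for big data.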

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0

Research-Volunteer-under-Professor-Hao-Yan

Some of my work as a research volunteer.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0

A-Case-Study-to-determine-the-most-effective-Production-system-for-new-Manufacturing-Company--Hassis

Hassis Games is a company set up to produce two board games, Atlantic City and Reward. With the yearly demand already known, the challenge is to assess the situation and determine the most effective production system to meet the deterministic demand for both board games. To meet the yearly demand, the company needs to estimate several requirements. First, it needs to estimate the machines required for manufacturing each individual component, using process times and the number of machines within a cell; along the way, the inventory levels of the components also have to be calculated. Second, the facility layout has to be designed, considering the given square footage of each machine plus miscellaneous tools, storage, inventory, and maneuvering space. The prorated financial cost of each machine over a five-year period has to be carefully evaluated to arrive at a better layout. Finally, the labor cost has to be calculated, keeping changeovers, inventory management, and material-transfer tasks in the cost analysis. Additionally, machine cost and facility cost must also be calculated to arrive at an overall cost estimate for the company.

Stargazers: 0 | Issues: 0

Computer-Hardware-Performance-Evaluation

The primary aim of the project is to find all the parameters that influence the response variable. Thorough data analysis must be conducted, covering data transformation, model adequacy, multicollinearity, tests of significance, variable selection, and model validation on the initial model. The final model should provide a good fit with the ability to predict future observations.
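One of the checks listed above, multicollinearity, is commonly screened with variance inflation factors. A small NumPy sketch on made-up predictors (the original project was in R; everything below is illustrative):

```python
# Variance inflation factor (VIF) screening on synthetic predictors.
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
# on all the other columns. VIF >> 1 signals multicollinearity.
import numpy as np

def vif(X):
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])  # add intercept
        beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ beta
        r2 = 1 - resid.var() / X[:, j].var()
        out.append(1 / (1 - r2))
    return out

rng = np.random.default_rng(3)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here x1 and x3 should show large VIFs while the independent x2 stays near 1, which is exactly the pattern a multicollinearity check looks for.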

Language: R | Stargazers: 0 | Issues: 0

Decision-Support-for-a-Two-Stage-Multi-Product-System

The idea most people associate with supply chain management is the movement and storage of products, whether finished goods or raw materials, from the point of origin to the point of consumption. In general terms, that is the definition of supply chain management. More precisely, it is the design, planning, execution, control, and monitoring of supply chain activities with the prime objective of creating net profit for the system.

Stargazers: 0 | Issues: 0

Exploratory_Data_Analysis_WhiteWineData

A thorough study of the variables and their impact on predicting the quality of white wine, through exploratory data analysis.

Language: HTML | Stargazers: 0 | Issues: 0

Visualising-Flight-Delay-using-Tableau

I obtained my dataset of flight delays from the RITA website, which contains information on flight delays and performance.

Stargazers: 0 | Issues: 0

Wrangle-WeRateDogs-twitter-s-data

Data wrangling is a widely used technique in data science and a crucial step before any analysis is conducted on a data set. It consists of three phases: gathering, assessing, and cleaning. In this project I performed wrangling on data from the @weratedogs Twitter handle.
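The three phases can be sketched on a tiny made-up table (the real project used the @weratedogs Twitter archive; the column names and issues below are invented for illustration):

```python
# Gather -> assess -> clean, on a toy stand-in for the tweets table.
import pandas as pd

# Gather: in the project this came from an archive file and the Twitter API.
df = pd.DataFrame({
    "name": ["Luna", "None", "Max"],
    "rating_numerator": [12, 13, 1400],
    "rating_denominator": [10, 10, 10],
})

# Assess: spot quality issues programmatically.
bad_names = df["name"].eq("None").sum()         # "None" used as a placeholder
outliers = df["rating_numerator"].gt(20).sum()  # implausible ratings

# Clean: fix each documented issue.
clean = df.replace({"name": {"None": pd.NA}})   # placeholder -> missing value
clean = clean[clean["rating_numerator"] <= 20].copy()
```

Keeping assessment findings explicit (as counts or notes) before cleaning is the discipline the three-phase split enforces.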

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

Certifications

A place to store all certificates

Stargazers: 0 | Issues: 0

Create-Customer-Segmentation-Report-for-Arvato-Financial-Services

Processed, transformed, and assessed demographic data of both the general population and a mail-order sales company to identify key features. Used unsupervised learning techniques, PCA and k-NN, to perform customer segmentation and identify the company's core customer traits. Trained and assessed various supervised learning models based on ROC-AUC, with the gradient boosting regressor performing best among all. GridSearchCV was used to tune each parameter with 5 stratified folds, obtaining a final mean AUC score of 0.7618.
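A minimal sketch of the segmentation step: dimensionality reduction followed by clustering. Note that k-means stands in here for the clustering step as an assumption; the data, feature counts, and group structure are all synthetic.

```python
# PCA to compress the "demographic" features, then clustering to segment.
# Two synthetic latent groups so the segmentation has something to find.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
group_a = rng.normal(0, 1, size=(150, 20))   # synthetic population segment
group_b = rng.normal(3, 1, size=(150, 20))   # synthetic customer-like segment
X = np.vstack([group_a, group_b])

X_reduced = PCA(n_components=5, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_reduced)
```

Comparing cluster membership between the general population and the customer file is what surfaces the "core customer" segments.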

Stargazers: 0 | Issues: 0

datasharing

The Leek group guide to data sharing

Stargazers: 0 | Issues: 0

Deploy-a-Sentiment-Analysis-Model

Processed and prepared the IMDB data using the stopwords and BeautifulSoup packages, built a word dict, and converted the reviews to fixed-length sequences with padding. Built and trained a PyTorch LSTM classifier model using the train/test data stored in S3, on an ml.m4.xlarge EC2 instance type in Amazon SageMaker. Deployed the tested model, which had an accuracy score of 0.864, and made it available through API Gateway using an IAM role and an AWS Lambda function.
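The word-dict and padding step mentioned above can be sketched in plain Python (the vocabulary and sentences are made up; the actual model was a PyTorch LSTM trained on SageMaker):

```python
# Map tokens to word-dict indices, then pad/truncate to a fixed length
# so every review becomes an equal-length integer sequence for the LSTM.
word_dict = {"<pad>": 0, "<unk>": 1, "great": 2, "movie": 3, "boring": 4}

def encode(tokens, length=5):
    ids = [word_dict.get(t, word_dict["<unk>"]) for t in tokens]
    ids = ids[:length]                                       # truncate long
    return ids + [word_dict["<pad>"]] * (length - len(ids))  # pad short

padded = encode(["great", "movie"])
```

Reserving index 0 for padding lets the downstream model mask it out; unknown words map to a shared `<unk>` index.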

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

findat

findata

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

HackerRank---SQL-Solutions

https://www.hackerrank.com/larkinjw

Stargazers: 0 | Issues: 0

handson-ml

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn and TensorFlow.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Identification-of-the-optimal-ingredient-mix-for-Pancakes

To measure the perceived quality of pancakes given controlled variations in the ingredients used. Over the past few months all of us tended to cook our own food, and whenever we made pancakes we noticed that they looked different every single time. Delving into this, we noticed that minor variations in our ingredients produced highly perceivable differences in the end product. That's when we realised that, ideally, an optimal mix of pancake ingredients should exist, one that the bulk of a given sample of people would prefer.

Stargazers: 0 | Issues: 0

kaggle-titanic

A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Demonstrates basic data munging, analysis, and visualization techniques. Shows examples of supervised machine learning techniques.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Statistical-Learning-for-Data-mining-and-classification

Objective: To build a classification model for the given data set based on the available classifiers in Weka, to explore the classifier settings and to learn and interpret the classification results.

Stargazers: 0 | Issues: 0

stockmarketanalysis

Using the power of Big Data Tools to analyze Stock Market

Language: Java | Stargazers: 0 | Issues: 0

tedsds

Turbofan Engine Degradation Simulation Data Set example in Apache Spark.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Test-a-Perceptual-Phenomenon

Analyzing the Stroop Effect. The t statistic is t = d_bar / (s_d / sqrt(n)), where d_bar is the average difference between the incongruent and congruent times, s_d is the sample standard deviation of those differences, and n is the sample size. Upon calculating, t is approximately 8.02. For a t-distribution with 23 degrees of freedom at alpha = 0.05, the critical value is 1.714, and the p-value is well below 0.05. This gives sufficient confidence to reject the null hypothesis, which states that the mean congruent and incongruent times are equal. The alternative hypothesis is therefore accepted: the mean incongruent times are greater than the congruent ones. The results match expectations.
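The statistic above can be computed directly. The sample below is made up purely to exercise the formula; the project used the real Stroop congruent/incongruent times.

```python
# Paired t statistic: t = d_bar / (s_d / sqrt(n)), computed on
# invented congruent/incongruent reaction times (seconds).
import math

congruent = [12.1, 14.3, 11.8, 13.5, 15.0]
incongruent = [19.2, 22.1, 18.4, 21.0, 23.3]
d = [b - a for a, b in zip(congruent, incongruent)]  # paired differences

n = len(d)
d_bar = sum(d) / n
# Sample standard deviation of the differences (n - 1 denominator).
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
t_stat = d_bar / (s_d / math.sqrt(n))
```

With n = 5 pairs the test would use n - 1 = 4 degrees of freedom; the project's 24 participants give the 23 degrees of freedom quoted above.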

Language: Python | Stargazers: 0 | Issues: 0

testing

just for testing!

License: Apache-2.0 | Stargazers: 0 | Issues: 0