Arch Desai's repositories
Customer-Survival-Analysis-and-Churn-Prediction
In this project, I use survival analysis models to see how the likelihood of customer churn changes over time and to calculate customer lifetime value (LTV). I also implemented a Random Forest model to predict whether a customer is going to churn and deployed a model as a Flask web app.
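As a rough illustration of how churn likelihood can be tracked over time, here is a minimal sketch of the Kaplan-Meier survival estimator in plain Python. It is not taken from the repository; the function name `kaplan_meier` and the toy tenure data are assumptions made for this example.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] for each time with a churn event.

    durations: tenure of each customer (for example, in months).
    churned:   True if the customer churned at that time, False if the
               customer was still active when observation ended (censored).
    """
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)  # every customer leaves the risk set at their duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk customers who did not churn at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: tenure in months and whether churn was observed.
durations = [1, 2, 2, 3, 4, 4, 5, 6]
churned = [True, True, False, True, False, True, False, True]
curve = kaplan_meier(durations, churned)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored customers still count toward the at-risk total until their last observed month, which is what distinguishes this estimate from a simple churn rate.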
Predictive-Maintenance-of-Aircraft-Engine
In this project, I apply various predictive maintenance techniques to predict the impending failure of an aircraft turbofan engine.
Instacart-Market-Basket-Analysis
The objective of this project is to analyze 3 million grocery orders from more than 200,000 Instacart users and predict which previously purchased items will be in a user's next order. Customer segmentation and affinity analysis are used to study customer purchase patterns and to support product marketing and cross-selling.
News-Articles-Recommendation
The objective of this project is to build a hybrid-filtering personalized news article recommendation system that suggests articles from popular news service providers based on the reading history of Twitter users who share similar interests (collaborative filtering) and on the content similarity between an article and the user's tweets (content-based filtering).
Wind-Turbine-Power-Curve-Estimation
In this project, I employ various regression techniques to estimate the power curve of an onshore wind turbine. Nonlinear tree-based ensemble regression methods perform best because the true power curve is nonlinear. I implemented XGBoost and tuned it with GridSearchCV, which yields the lowest test RMSE of 6.404.
Machine-Predictive-Maintenance-PdM
In this project, I apply predictive maintenance techniques over 100 MB of historical data from twenty of a company's units that failed in the field. My objective is to see whether the data for the units with the longest or shortest lives show similarities and to predict which active units will fail soon.
Hourly-Energy-Consumption-Prediction
In this project, I used models such as XGBoost and Fbprophet on hourly energy consumption data to predict future energy usage. Features are extracted from timestamps to find trends on a daily, weekly, monthly, quarterly, and yearly basis, and the Fbprophet model's performance is improved by incorporating public holidays in the analysis.
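The timestamp feature extraction described above could look roughly like the following sketch, using only the Python standard library. The function name `timestamp_features`, the chosen feature names, and the example holiday set are assumptions for illustration and are not taken from the repository; the `is_holiday` flag is a simple stand-in and does not reproduce how holidays are passed to Fbprophet.

```python
from datetime import date, datetime


def timestamp_features(ts, holidays=frozenset()):
    """Derive calendar features from one hourly timestamp.

    holidays is a set of datetime.date objects (hypothetical input).
    """
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),           # Monday = 0, Sunday = 6
        "month": ts.month,
        "quarter": (ts.month - 1) // 3 + 1,
        "year": ts.year,
        "day_of_year": ts.timetuple().tm_yday,
        "week_of_year": ts.isocalendar()[1],   # ISO week number
        "is_holiday": ts.date() in holidays,
    }


# Hypothetical example: one reading on a public holiday.
features = timestamp_features(
    datetime(2018, 7, 4, 15),
    holidays={date(2018, 7, 4)},
)
print(features)
```

Each resulting dictionary can serve as one row of a feature table for a tree-based regressor, which can then pick up daily, weekly, and seasonal patterns without seeing the raw timestamp.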
DS-Challenges
This repository contains code for the online Python/ML/AI/statistics challenges I have solved.
Loan-Default-Prediction
In this project, I applied various classification models, such as Logistic Regression, Random Forest, and LightGBM, to detect and classify consumers who will default on a loan. SMOTE is used to combat class imbalance, and LightGBM achieved the highest accuracy (98.89%) with an F1 score of 0.99.
Multivariate-Phase-1-Analysis
The objective of this project is to identify the in-control data points and eliminate out-of-control data points in order to set up distribution parameters for manufacturing process monitoring. I used PCA for dimension reduction and Hotelling T² and m-CUSUM control charts to establish the mean and variance matrices.
Drone-Flights-Analysis
The objective of this project is to perform independent exploratory data analysis and visualization of drone flight data in order to find hidden trends, patterns, and anomalies.
Statistical-Methods
Statistical methods
archd3sai.github.io
My Personal Website
CodeSnippets
This repository contains code snippets I use on a daily basis.
LSTM-Best-Practices
This repository contains a method for developing an LSTM model for any task more efficiently using rules of thumb.
Predicting-GDP-of-India
The objective of this project is to perform a predictive assessment of the Gross Domestic Product of India through an inferential analysis of various socio-economic factors, to find out which predictors contribute most to GDP. Various models are compared, and a stepwise regression model is implemented, which resulted in a test MSE of 5.7%.
Ranking-of-NFL-Teams-using-Markov-method
In this project, I implemented and compared three approaches based on the stationary distribution of a Markov chain to rank the 32 NFL (National Football League) teams from "Best" to "Worst" using the scores from the 2007 NFL regular season.
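To make the idea concrete, here is a minimal sketch of one way to rank teams by the stationary distribution of a Markov chain. It is not necessarily any of the three approaches in the repository. It assumes that each loser "votes" for the winner in proportion to the margin of defeat, and it adds PageRank-style damping so that the chain always has a unique stationary distribution; the function name `rank_teams`, the damping value, and the three-team results are hypothetical.

```python
def rank_teams(games, teams, damping=0.85, iterations=200):
    """Rank teams by the stationary distribution of a vote-based Markov chain.

    games: iterable of (winner, loser, margin) tuples (hypothetical format).
    Returns (ranking, scores) where scores maps team -> stationary probability.
    """
    n = len(teams)
    idx = {team: i for i, team in enumerate(teams)}

    # votes[i][j]: total margin by which team j beat team i.
    votes = [[0.0] * n for _ in range(n)]
    for winner, loser, margin in games:
        votes[idx[loser]][idx[winner]] += margin

    # Row-normalize into transition probabilities; a team with no losses
    # spreads its vote uniformly (an assumption of this sketch).
    transition = []
    for row in votes:
        total = sum(row)
        base = [v / total for v in row] if total > 0 else [1.0 / n] * n
        # Damping mixes in a uniform jump so the chain is irreducible.
        transition.append([damping * p + (1 - damping) / n for p in base])

    # Power iteration: repeatedly apply pi <- pi * P from a uniform start.
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]

    scores = dict(zip(teams, pi))
    ranking = sorted(teams, key=scores.get, reverse=True)
    return ranking, scores


# Hypothetical results: A beat B by 7, A beat C by 3, B beat C by 10.
games = [("A", "B", 7), ("A", "C", 3), ("B", "C", 10)]
ranking, scores = rank_teams(games, ["A", "B", "C"])
print(ranking)
```

A team scores highly when the chain spends a lot of time on it, which happens when it beats teams that are themselves highly ranked.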
Stat-689-Assignments
This repository contains all the assignments I completed as part of the academic course Stat 689: Statistical Computation with Python.
Summary-of-Research-Papers
This repository contains summaries of various research papers I have read. Feel free to fork and collaborate.
Tennis-Players-Ranking
The objective of this project is to rank all tennis players based on the matches they played in 2018. Statistics for all matches are given, including the scores in every set. This project comprises four approaches to ranking tennis players, and I have tried to make each successive approach more robust.