harshraj11584 / MA2142-RegressionAnalysis-FIFA2019-OverallRating

Project for MA2142 Regression Analysis: predicting Overall Rating on the FIFA 2019 dataset using only linear-regression-based techniques. RMSE = 0.11

RegressionAnalysis-FIFA2019-OverallRating

Predicting Overall Rating for FIFA 2019 Dataset using only Linear Regression based techniques

Link to DataSet

All numerical features except Potential Rating were used as predictors of Overall Rating.

Scores:

Adjusted R^2 = 0.90
RMSE = 0.11
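
As a rough illustration of how scores like these can be computed, the sketch below fits a linear model on every numeric column except Potential and reads off the adjusted R^2 and RMSE. The data frame `players`, its columns other than Overall and Potential, and all values are hypothetical stand-ins for the real dataset. The RMSE here is in-sample, which is an assumption; this README does not say how the reported RMSE was evaluated.

```r
# Toy stand-in for the FIFA 2019 data; column names and values are hypothetical.
set.seed(1)
n <- 200
players <- data.frame(
  Name      = paste0("player", seq_len(n)),  # non-numeric, dropped below
  Potential = rnorm(n, 70, 6),               # numeric, deliberately excluded
  Finishing = rnorm(n, 60, 10),
  Stamina   = rnorm(n, 65, 8)
)
players$Overall <- 20 + 0.4 * players$Finishing + 0.3 * players$Stamina +
  rnorm(n, sd = 2)

# Keep numeric columns only, then drop the response and Potential.
is_num   <- vapply(players, is.numeric, logical(1))
features <- setdiff(names(players)[is_num], c("Overall", "Potential"))

# Build the formula Overall ~ <features> and fit ordinary least squares.
fit <- lm(reformulate(features, response = "Overall"), data = players)

adj_r2 <- summary(fit)$adj.r.squared
rmse   <- sqrt(mean(residuals(fit)^2))  # in-sample RMSE (assumption)

cat("Adjusted R^2:", round(adj_r2, 3), " RMSE:", round(rmse, 3), "\n")
```

Using `reformulate()` keeps the predictor list in one place, so excluding another column only means adding its name to the `setdiff()` call.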

Link to Kaggle R Notebook

First 100 predictions:

[Figure: first 100 predictions]

Process:

Removed multicollinear features, then tested global significance (overall F-test) and individual significance (t-test on each coefficient).
Residual analysis plots: (1) Residuals vs Fitted, (2) Normal Q-Q, (3) Scale-Location, (4) Residuals vs Leverage.
The residuals look approximately normal, but the Shapiro-Wilk test rejects normality. This is expected at this sample size: a simulation shows that, for n = 5000, the Shapiro-Wilk test rejects samples that deviate only slightly from a normal distribution.
Applied a Box-Cox transformation to y to stabilize the variance, removed outliers detected using leverage, Cook's distance, DFBETAS, DFFITS, and COVRATIO, then retrained the model.
Also applied model selection techniques from the olsrr package: backward elimination, stepwise selection, stepwise backward AIC, and stepwise forward AIC.
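
Two of the steps above can be sketched on toy data. The first part estimates how often Shapiro-Wilk rejects at n = 5000 for exactly normal samples versus a mildly heavy-tailed alternative. The t-distribution with 10 degrees of freedom is an assumed stand-in for "slight deviation", not necessarily what the notebook simulated. The second part picks a Box-Cox lambda with `MASS::boxcox()`, flags influential rows with base R's `influence.measures()` (which covers DFBETAS, DFFITS, COVRATIO, Cook's distance, and hat values using its own default cutoffs), and refits. All data, names, and cutoffs here are illustrative assumptions rather than the project's actual code.

```r
library(MASS)  # boxcox(); MASS ships with standard R installations

set.seed(42)

# --- 1. Shapiro-Wilk at n = 5000: exact normal vs. slightly heavy-tailed ---
# Fraction of simulated samples for which the test rejects at level alpha.
reject_rate <- function(draw, n = 5000, reps = 100, alpha = 0.05) {
  mean(replicate(reps, shapiro.test(draw(n))$p.value < alpha))
}
rate_normal <- reject_rate(rnorm)                       # truly normal samples
rate_t10    <- reject_rate(function(n) rt(n, df = 10))  # mild deviation (assumed)

# --- 2. Box-Cox on y, then flag influential rows and refit ---
n <- 300
x <- runif(n, 1, 10)
y <- (2 + 0.5 * x + rnorm(n, sd = 0.3))^2   # toy strictly positive response
y[1:3] <- y[1:3] * 4                        # plant a few outliers
toy <- data.frame(x = x, y = y)

fit0   <- lm(y ~ x, data = toy)
bc     <- boxcox(fit0, lambda = seq(-2, 2, 0.05), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]             # lambda maximizing the profile likelihood

# Box-Cox transform; lambda = 0 corresponds to the log transform.
toy$y_bc <- if (abs(lambda) < 1e-8) log(toy$y) else (toy$y^lambda - 1) / lambda
fit1 <- lm(y_bc ~ x, data = toy)

# Rows marked influential by at least one of the measures.
infl    <- influence.measures(fit1)
flagged <- apply(infl$is.inf, 1, any)

# Refit on the transformed response without the flagged rows.
fit2 <- lm(y_bc ~ x, data = toy[!flagged, ])

cat("Rejection rate, normal:", rate_normal, " t(10):", rate_t10, "\n")
cat("lambda:", lambda, " rows flagged:", sum(flagged), "\n")
```

Dropping every row flagged by any measure is a blunt rule chosen for brevity. In practice each flagged observation is usually inspected before removal.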


Languages

Language:R 100.0%