The aim of this project is to attempt to predict whether or not an individual will suffer a stroke.
First, I will perform extensive data visualization. This will help me see whether any features appear to be indicative of a stroke, or indeed of not having a stroke.
Next, I will build multiple models and select the best-performing one. I will use F1 score as my primary metric because the dataset is imbalanced (though I will also address the imbalance with SMOTE).
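To illustrate why F1 is a better primary metric than accuracy on an imbalanced dataset, here is a minimal sketch in plain Python. The labels are made up for illustration (95 negatives, 5 positives) and are not taken from the stroke dataset; the helper names `accuracy` and `f1_score` are my own, not a library API.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined),
        # so F1 is reported as 0 by convention here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced labels: 95 "no stroke" (0) and 5 "stroke" (1).
y_true = [0] * 95 + [1] * 5

# A useless model that always predicts "no stroke".
always_negative = [0] * 100

# A model that finds 3 of the 5 positives at the cost of 2 false alarms.
some_signal = [1, 1] + [0] * 93 + [1, 1, 1, 0, 0]

baseline_accuracy = accuracy(y_true, always_negative)
baseline_f1 = f1_score(y_true, always_negative)
signal_accuracy = accuracy(y_true, some_signal)
signal_f1 = f1_score(y_true, some_signal)

print(f"always negative: accuracy={baseline_accuracy:.2f}, F1={baseline_f1:.2f}")
print(f"some signal:     accuracy={signal_accuracy:.2f}, F1={signal_f1:.2f}")
```

The always-negative model scores high on accuracy despite never identifying a positive case, while its F1 is zero. The second model's accuracy is barely different, but its F1 reflects that it actually finds some positives, which is the behaviour I care about here.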
I will also delve into model interpretation. This is incredibly important in industry. Often we need to explain very technical algorithms to a non-technical audience, so any tool that can help this process should be mastered.
Credit is given to Josh from Kaggle.