Stroke Prediction
In this project, we classify patients to predict whether or not they will have a stroke.
Dataset
The dataset is available on Kaggle and can be found at this link. It contains 5110 data points (rows), and each row provides the following information about a patient:
id: unique identifier
gender: "Male", "Female" or "Other"
age: age of the patient
hypertension: 0 if the patient doesn't have hypertension, 1 if the patient has hypertension
heart_disease: 0 if the patient doesn't have any heart disease, 1 if the patient has a heart disease
ever_married: "No" or "Yes"
work_type: "children", "Govt_job", "Never_worked", "Private" or "Self-employed"
Residence_type: "Rural" or "Urban"
avg_glucose_level: average glucose level in blood
bmi: body mass index
smoking_status: "formerly smoked", "never smoked", "smokes" or "Unknown"*
stroke: 1 if the patient had a stroke, 0 if not
*Note: "Unknown" in smoking_status means that the information is unavailable for this patient.
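The column descriptions above can be turned into a small sanity check for rows read from the CSV. This is a minimal sketch using only the standard library, not code from the notebook: `validate_row` is a hypothetical helper, the two sample rows are made up for illustration, and the work_type value is written "Govt_job" on the assumption that this is the spelling used in the CSV.

```python
import csv
import io

# Allowed values per categorical column, copied from the column list above.
# "Govt_job" is an assumed spelling; adjust it if the CSV differs.
ALLOWED = {
    "gender": {"Male", "Female", "Other"},
    "hypertension": {"0", "1"},
    "heart_disease": {"0", "1"},
    "ever_married": {"No", "Yes"},
    "work_type": {"children", "Govt_job", "Never_worked", "Private", "Self-employed"},
    "Residence_type": {"Rural", "Urban"},
    "smoking_status": {"formerly smoked", "never smoked", "smokes", "Unknown"},
    "stroke": {"0", "1"},
}
# Columns expected to hold numbers.
NUMERIC = ("age", "avg_glucose_level", "bmi")


def validate_row(row):
    """Return a list of problems found in one row (dict of column -> string)."""
    problems = []
    for column, allowed in ALLOWED.items():
        value = row.get(column)
        if value not in allowed:
            problems.append(f"{column}: unexpected value {value!r}")
    for column in NUMERIC:
        try:
            float(row.get(column, ""))
        except (TypeError, ValueError):
            problems.append(f"{column}: not a number ({row.get(column)!r})")
    return problems


# Made-up rows for illustration only; they are NOT taken from the dataset.
SAMPLE = io.StringIO(
    "id,gender,age,hypertension,heart_disease,ever_married,work_type,"
    "Residence_type,avg_glucose_level,bmi,smoking_status,stroke\n"
    "1,Female,54,0,0,Yes,Private,Urban,101.5,27.3,never smoked,0\n"
    "2,Male,61,1,0,Yes,Govt_job,Rural,140.2,unknown,Unknown,2\n"
)
for row in csv.DictReader(SAMPLE):
    print(row["id"], validate_row(row))
```

To run the same check on the real data, replace `SAMPLE` with an open file handle to the downloaded CSV. Rows that come back with a non-empty list are the ones to inspect before training.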
Repo Structure
dataset/healthcare-dataset-stroke-data: full dataset downloaded from Kaggle
Makefile: Makefile used to create a virtual env or a Docker container to run the notebook
Dockerfile: Dockerfile to build a Docker image with GPU support (tested on NVIDIA TITAN RTX with Ubuntu 18.04). More information on how to download Docker can be found here. nvidia-docker installation instructions can also be found here.
StrokePrediction.ipynb: notebook containing all the code
poetry.lock / pyproject.toml: Python Poetry files to install a development environment with all dependencies handled
environment.yml: YAML file containing the exported conda environment for development (you can use either Poetry or conda)
Running the Notebook
Clone the repository and cd into it before running any of the commands:
git clone git@github.com:charbel-a-hC/ups-ml-stroke-prediction.git
cd ups-ml-stroke-prediction
You also need to install python3.8 locally if you wish to run the notebook in a local environment. For Ubuntu:
sudo apt-get install python3.8 \
python3-pip \
python3.8-venv \
python3.8-dev \
python3.8-distutils
You also need to upgrade pip:
/usr/bin/python3.8 -m pip install --upgrade pip
Docker
If you have docker installed:
docker build . -t ups-ml-stroke-prediction
docker run -it --rm --runtime=nvidia -v ${PWD}:/ups-ml-stroke-prediction ups-ml-stroke-prediction bash
After launching the container, the notebook will be available at localhost:8888.
Local Environment (Ubuntu-18.04) - Poetry
Simply run the make command in the repository:
make env
A virtual environment will be created after running the above command. In the same shell, run:
poetry shell
This will activate the environment and then you can open a jupyter notebook:
jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root
Local Environment - Conda
You can download Anaconda here. After installing it, open an Anaconda Prompt if you're on Windows and run the following commands:
conda env create -f environment.yml
conda activate ml
Note: If you're on Linux, you can open a normal terminal and run the following command before creating the environment:
conda activate base
You can then open a jupyter notebook similarly:
jupyter notebook
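Whichever setup is used, one quick way to confirm that the notebook server came up is to check that its port accepts connections. The sketch below is an illustration only: `is_listening` is a hypothetical helper, and port 8888 is taken from the commands above. A successful connection only shows that something is listening on that port, not that it is Jupyter.

```python
import socket


def is_listening(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved hosts.
        return False


if __name__ == "__main__":
    print("Port 8888 reachable:", is_listening())
```

If this reports False while the server is running inside a container, the port is probably not reachable from the host, and the check should be run from inside the container instead.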