amoustakis/Estimation-of-vital-status-of-cancer-patients-using-Machine-Learning

google-colaboratory knearest-neighbor-classifier machine-learning pca python random-forest-classifier logistic-regression svm-classifier xgboost-classifier

This assignment, written in the form of a research paper, was carried out for the module ‘Machine Learning’ of the MSc ‘Data Science and Machine Learning’. The objective was to compare different Machine Learning models on the task of predicting the vital status of patients with high grade serous ovarian cancer. Five classifiers were trained: K-Nearest-Neighbors, Support Vector Machine, Logistic Regression, Random Forest and XGBoost. Missing values in the dataset were filled with K-Nearest-Neighbors imputation, attributes were selected or reduced with a variance threshold and PCA, and features were scaled with Min-Max scaling and Z-score standardization. The models were validated with 5-fold Cross Validation, and their hyperparameters were selected with Grid Search. Performance was evaluated with two metrics, Accuracy and Area Under the ROC Curve (AUC).
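The workflow described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the order of the preprocessing steps, the parameter grid and all variable names are assumptions, and only Logistic Regression is shown (any of the other classifiers could be placed in the `clf` step in the same way).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for the patient data (assumption: binary vital status).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Knock out about 5% of the values so the imputer has something to fill.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Preprocessing and classifier in one pipeline, so every step is refit
# on the training part of each fold only.
pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),          # KNN imputation
    ("variance", VarianceThreshold(threshold=0.0)), # drop constant columns
    ("scale", StandardScaler()),                    # placeholder, tuned below
    ("pca", PCA(n_components=5)),                   # dimensionality reduction
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: compare the two scalers, two PCA sizes, three C values.
param_grid = {
    "scale": [MinMaxScaler(), StandardScaler()],
    "pca__n_components": [5, 10],
    "clf__C": [0.1, 1.0, 10.0],
}

# 5-fold cross validation, scored on both accuracy and AUC.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    pipeline,
    param_grid,
    cv=cv,
    scoring={"accuracy": "accuracy", "auc": "roc_auc"},
    refit="auc",  # the best configuration is chosen by mean AUC
)
search.fit(X, y)

best = search.best_index_
mean_accuracy = search.cv_results_["mean_test_accuracy"][best]
mean_auc = search.cv_results_["mean_test_auc"][best]
print("best parameters:", search.best_params_)
print(f"accuracy={mean_accuracy:.3f}  auc={mean_auc:.3f}")
```

Keeping imputation, scaling and PCA inside the pipeline means the grid search never fits them on validation data, which avoids leaking information between folds.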

About

Estimation of vital status of patients with ovarian cancer using Machine Learning models (K-Nearest-Neighbors, Support Vector Machine, Logistic Regression, Random Forest and XGBoost)

Languages

Jupyter Notebook 100.0%