raulalmuzara / tumor-classification-pca-svm

Principal Component Analysis and Support Vector Machine for tumor classification in patients from the original Wisconsin breast cancer database.

Principal Component Analysis and Support Vector Machine for tumor classification

Analysis of the Breast Cancer Wisconsin (Original) Dataset

Original data in the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/15/breast+cancer+wisconsin+original

Kaggle notebook: https://www.kaggle.com/raulalmuzara/pca-and-svm-for-tumor-classification

The dataset covers 699 patients, each with either a benign tumor (Class = 2) or a malignant tumor (Class = 4). In addition to a Sample Code Number, each record has 9 biological features rated with integer values from 1 to 10: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, and Mitoses. We first reduce these 9 features to 2 variables with Principal Component Analysis, then classify the patients with a Support Vector Machine. The objective is to predict a patient's tumor class (benign or malignant).
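The two steps described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the notebook's actual code: the standardization step, the RBF kernel, the 75/25 split, and the helper names `build_model` and `make_placeholder_data` are all assumptions. To stay self-contained, the sketch runs on synthetic placeholder ratings rather than the real table. The real UCI file would also need cleaning first, since the UCI documentation reports missing Bare Nuclei values marked with `?`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The 9 features named in the dataset description.
FEATURES = [
    "Clump Thickness", "Uniformity of Cell Size", "Uniformity of Cell Shape",
    "Marginal Adhesion", "Single Epithelial Cell Size", "Bare Nuclei",
    "Bland Chromatin", "Normal Nucleoli", "Mitoses",
]


def build_model(n_components=2, kernel="rbf", random_state=0):
    """Scale the features, project them onto principal components, fit an SVM.

    Scaling and the kernel choice are assumptions made for this sketch.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        SVC(kernel=kernel, random_state=random_state),
    )


def make_placeholder_data(n_patients=699, random_state=0):
    """SYNTHETIC stand-in for the real table (hypothetical helper).

    Produces integer ratings in 1..10 and class labels 2 (benign) / 4 (malignant).
    Purely for illustration, malignant rows are shifted toward higher ratings.
    """
    rng = np.random.default_rng(random_state)
    y = rng.choice([2, 4], size=n_patients)
    base = rng.integers(1, 7, size=(n_patients, len(FEATURES)))  # values 1..6
    shift = np.where(y == 4, 4, 0)[:, None]                      # malignant: 5..10
    return base + shift, y


X, y = make_placeholder_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_model()
model.fit(X_train, y_train)

# Accuracy on the held-out rows (meaningless for real tumors: the data is synthetic).
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on placeholder data: {accuracy:.3f}")

# The 2 PCA variables for each test patient, e.g. for a scatter plot.
components = model[:-1].transform(X_test)
print(components.shape)
```

Putting the scaler, PCA, and SVM in one pipeline means the projection is fitted on the training rows only, so the held-out rows do not leak into the principal components.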

Languages

Jupyter Notebook: 100.0%