deepak525 / Breast-Cancer-Visualization-and-Classification

This analysis aims to identify which features are most helpful in predicting whether a tumor is malignant or benign, and to observe general trends that may aid us in model selection and hyperparameter selection.

Repository from GitHub: https://github.com/deepak525/Breast-Cancer-Visualization-and-Classification

Breast_Cancer

The Breast Cancer (Wisconsin) Diagnosis dataset contains the diagnosis and a set of 30 features describing the characteristics of the cell nuclei present in a digitized image of a fine needle aspirate (FNA) of a breast mass.

Ten real-valued features are computed for each cell nucleus:

  • radius (mean of distances from center to points on the perimeter);
  • texture (standard deviation of gray-scale values);
  • perimeter;
  • area;
  • smoothness (local variation in radius lengths);
  • compactness (perimeter^2 / area - 1.0);
  • concavity (severity of concave portions of the contour);
  • concave points (number of concave portions of the contour);
  • symmetry;
  • fractal dimension (“coastline approximation” - 1).
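
As a small worked example of one of the definitions above, the compactness formula (perimeter^2 / area - 1.0) can be evaluated on simple shapes. This is only an illustrative sketch: the helper name `compactness` and the shapes are assumptions made here, not code from the repository. A circle gives the smallest possible value (4π - 1), so more irregular contours score higher.

```python
import math


def compactness(perimeter, area):
    """Compactness as defined in the feature list: perimeter^2 / area - 1.0."""
    return perimeter ** 2 / area - 1.0


# Hypothetical shapes, chosen only to illustrate the formula.
r = 5.0
circle = compactness(2 * math.pi * r, math.pi * r ** 2)  # equals 4*pi - 1

side = 5.0
square = compactness(4 * side, side ** 2)  # equals 16 - 1

print(f"circle: {circle:.3f}, square: {square:.3f}")
```

The value does not depend on the size of the shape, only on its outline, which is why it is useful for comparing nuclei of different sizes.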

The mean, standard error (SE) and “worst” or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features.
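
The three summaries described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `summarize`, the column-name format, and the sample radii are made up here, and the standard error is taken as the sample standard deviation divided by the square root of the number of nuclei.

```python
import math
import statistics

BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
SUMMARIES = ["mean", "se", "worst"]


def summarize(values):
    """Return (mean, standard error, worst) for one feature over all nuclei in an image."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    # "worst" = mean of the three largest values
    worst = statistics.fmean(sorted(values, reverse=True)[:3])
    return mean, se, worst


# 10 base features x 3 summaries = 30 columns (naming scheme is an assumption)
feature_names = [f"{name} ({stat})" for stat in SUMMARIES for name in BASE_FEATURES]

# Made-up radii for illustration only, not taken from the dataset
radii = [10.0, 12.0, 11.0, 15.0, 14.0]
mean_r, se_r, worst_r = summarize(radii)
print(len(feature_names), mean_r, se_r, worst_r)
```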

We will analyze the features to understand their predictive value for the diagnosis. We will then build models using two different algorithms and use them to predict the diagnosis.
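
The workflow described above might look roughly like the following sketch. Several things here are assumptions rather than the notebook's actual choices: the text does not name the two algorithms, so logistic regression and a random forest are used purely as stand-ins; the data is loaded from the copy bundled with scikit-learn (`load_breast_cancer`) rather than from the repository's own data file; and the split ratio and random seeds are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# scikit-learn's bundled copy: 569 samples, 30 features,
# target 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stand-in algorithms (assumption): a scaled linear model and a tree ensemble.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Accuracy on a single split is only a rough check; cross-validation and metrics such as recall for the malignant class would give a more reliable picture.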


