This repository contains a Python script that builds a logistic regression model to classify breast tumors as benign or malignant. The script performs the following steps:
- Importing Dependencies: The required libraries (NumPy, Pandas, and scikit-learn) are imported.
- Data Collection & Processing: The breast cancer dataset is loaded from scikit-learn and converted into a Pandas DataFrame. Descriptive statistics and general information about the data are inspected, and the data is checked for missing values.
- Separating Features and Target: The features (X) and the target variable (Y) are separated.
- Splitting Data: The dataset is split into a training set (80%) and a testing set (20%).
- Model Training: A logistic regression model is fitted on the training data.
- Model Evaluation: The model's accuracy is measured on both the training and the testing data.
- Building a Predictive System: A function predicts the label (benign or malignant) from a set of input features, and an example of its use is included.
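
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the helper name `predict_label`, the `random_state=2` seed, and the raised `max_iter` are assumptions made for this example. In scikit-learn's copy of the dataset, the target encodes 0 as malignant and 1 as benign.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Data collection: load the dataset bundled with scikit-learn into a DataFrame.
dataset = load_breast_cancer()
df = pd.DataFrame(dataset.data, columns=dataset.feature_names)
df["label"] = dataset.target  # scikit-learn encoding: 0 = malignant, 1 = benign

# Processing: descriptive statistics and a missing-value check.
print(df.describe())
print("Missing values:", int(df.isnull().sum().sum()))

# Separate the features (X) from the target (Y).
X = df.drop(columns="label")
Y = df["label"]

# 80% training, 20% testing. The seed is an arbitrary choice for reproducibility.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=2
)

# Assumption: max_iter is raised because the default solver may stop
# before converging on unscaled features.
model = LogisticRegression(max_iter=10000)
model.fit(X_train, Y_train)

# Evaluate accuracy on both splits.
train_accuracy = accuracy_score(Y_train, model.predict(X_train))
test_accuracy = accuracy_score(Y_test, model.predict(X_test))
print("Training accuracy:", train_accuracy)
print("Testing accuracy:", test_accuracy)


def predict_label(features):
    """Hypothetical helper: classify one sample given its 30 feature values."""
    values = np.asarray(features, dtype=float).reshape(1, -1)
    sample = pd.DataFrame(values, columns=X.columns)
    prediction = model.predict(sample)[0]
    return str(dataset.target_names[prediction])


# Example usage: classify the first sample of the test set.
print("Predicted:", predict_label(X_test.iloc[0].to_numpy()))
```

Wrapping the single sample in a DataFrame with the training column names keeps the input in the same shape and labeling the model was fitted on. Comparing the training and testing accuracy gives a quick indication of whether the model is overfitting.
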
Try it on Colab: https://colab.research.google.com/drive/1M6qfonnaUzLlhMf3n-wUXWvTl5OkWX7M?usp=sharing
Dataset used: https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data
Feel free to explore the code and use it for breast cancer classification tasks. If you have any questions or suggestions, please reach out.