FahedKaddou / ML_Classification_Logistic_Regression

Use scikit-learn Logistic Regression to classify

ML_Classification_Logistic_Regression

Table of contents

  • Load Data From CSV File
  • Data pre-processing and selection
  • Train/Test dataset
  • Modeling (Logistic Regression with Scikit-learn)
  • Prediction
  • Evaluation (jaccard index, confusion matrix, log loss)
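The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn and NumPy are installed, uses synthetic data from `make_classification` as a stand-in for the churn table, and picks the split ratio, random seed, and default model settings arbitrarily.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, jaccard_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data (assumption: NOT the real dataset).
X, y = make_classification(n_samples=200, n_features=7, random_state=4)

# Train/test split: hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4
)

# Pre-processing: standardize features using training-set statistics only.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Modeling: fit a logistic regression classifier with default settings.
model = LogisticRegression().fit(X_train, y_train)

# Prediction: hard class labels and per-class probabilities.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)

# Evaluation: confusion matrix, Jaccard index, and log loss.
cm = confusion_matrix(y_test, y_pred, labels=[1, 0])
jac = jaccard_score(y_test, y_pred, pos_label=1)
ll = log_loss(y_test, y_prob, labels=[0, 1])

print(cm)
print("Jaccard:", jac)
print("Log loss:", ll)
```

Here the scaler is fitted after the split so that test-set statistics do not leak into training; a notebook that standardizes the whole table first would follow the order of the list more literally.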

Objectives

  • Use scikit-learn Logistic Regression to classify
  • Understand confusion matrix
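To make the confusion matrix concrete, here is a small hand-rolled sketch of the three evaluation measures named in the table of contents, written in plain Python so each count is visible. The labels and probabilities are made-up illustration values, and the helper names (`confusion_counts`, `jaccard_index`, `binary_log_loss`) are hypothetical, not taken from the notebook or from scikit-learn.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def jaccard_index(tp, fp, fn):
    """Jaccard index of the positive class: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def binary_log_loss(y_true, p_positive, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels under predicted P(y=1)."""
    total = 0.0
    for t, p in zip(y_true, p_positive):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Made-up example: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
p_pos = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("TP, FP, FN, TN:", tp, fp, fn, tn)  # 3 1 1 3
print("Jaccard:", jaccard_index(tp, fp, fn))
print("Log loss:", binary_log_loss(y_true, p_pos))
```

The Jaccard index ignores true negatives, so it measures only how well the predicted and actual positive sets overlap, while log loss also penalizes predictions that are confident and wrong.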

About the dataset

We will use a telecommunications dataset to predict customer churn. It is a historical customer dataset in which each row represents one customer. The data is relatively easy to understand, and you may uncover insights you can use immediately. Because it is typically less expensive to keep customers than to acquire new ones, the analysis focuses on predicting which customers will stay with the company. The dataset provides information that helps you identify the behaviors associated with customer retention, so you can analyze the relevant customer data and develop focused retention programs.

The dataset includes information about:

  • Customers who left within the last month – the column is called Churn
  • Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
  • Customer account information – how long they had been a customer, contract, payment method, paperless billing, monthly charges, and total charges
  • Demographic information about customers – gender, age range, and whether they have partners and dependents

Downloading the Data

The data is hosted on IBM Object Storage and can be downloaded from: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-ML0101EN-SkillsNetwork/labs/Module%203/data/ChurnData.csv
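One way to fetch and parse the file, sketched with the Python standard library only. The helper names `parse_csv` and `fetch_rows` are hypothetical, the file is assumed to be UTF-8 encoded with a header row, and the 30-second timeout is an arbitrary choice.

```python
import csv
import io
import urllib.request

CHURN_URL = (
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/"
    "IBMDeveloperSkillsNetwork-ML0101EN-SkillsNetwork/labs/Module%203/data/"
    "ChurnData.csv"
)


def parse_csv(text):
    """Parse CSV text with a header row into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rows(url=CHURN_URL, timeout=30):
    """Download the CSV at `url` and return its rows as dicts of strings."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_csv(resp.read().decode("utf-8"))


if __name__ == "__main__":
    rows = fetch_rows()  # requires network access
    print(len(rows), "rows; columns:", list(rows[0].keys()))
```

All values come back as strings, so numeric columns need converting before modeling. If pandas is available, `pandas.read_csv` also accepts a URL directly and returns typed columns.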

Languages

Language:Jupyter Notebook 100.0%