Hemaatchu / Implementation-of-Logistic-Regression-Model-to-Predict-the-Placement-Status-of-Student

Implementation-of-Logistic-Regression-Model-to-Predict-the-Placement-Status-of-Student

AIM :

To write a program that implements a Logistic Regression model to predict the placement status of a student.

Equipment Required :

  1. Hardware – PCs
  2. Anaconda – Python 3.7 Installation / Jupyter notebook

Algorithm :

  1. Import the required packages and load the placement data.
  2. Print the placement data and the salary data.
  3. Check for null and duplicate values.
  4. Encode the categorical columns, train a logistic regression model, and compute the predictions, accuracy, confusion matrix, and classification report.
  5. Display the results.
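
To illustrate what the logistic regression step computes when it predicts a label, here is a minimal pure-Python sketch. It is not the scikit-learn implementation used in the program. The helper names `sigmoid` and `predict_label`, the two features, and the weights and bias are all hypothetical and chosen only for illustration; in the program these coefficients are learned by `lr.fit`.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 if the estimated probability reaches the threshold, else 0."""
    # Linear score: weighted sum of the features plus the bias term
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= threshold else 0


# Hypothetical weights and bias for two features (illustration only,
# not values learned from Placement_Data.csv)
weights = [0.08, 0.05]
bias = -8.0

print(sigmoid(0.0))                                 # 0.5
print(predict_label([80.0, 70.0], weights, bias))   # 1
print(predict_label([40.0, 45.0], weights, bias))   # 0
```

A score of zero corresponds to a probability of exactly 0.5, which is why 0.5 is the usual default threshold between the two classes.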

Program :

# Program to implement the Logistic Regression Model to Predict the Placement Status of Student.
# Developed by: HEMAVATHY S
# RegisterNumber:  212223230076


import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the data
data = pd.read_csv("Placement_Data.csv")

# Print the entire DataFrame
print("Placement Data:")
print(data)

# Print only the salary column (if it exists)
if 'salary' in data.columns:
    print("\nSalary Data:")
    print(data['salary'])
else:
    print("\n'Salary' column not found in DataFrame")

# Remove unnecessary columns (if any)
data1 = data.drop(["salary"], axis=1, errors='ignore')

# Check for missing values
print("\nMissing Values Check:")
print(data1.isnull().sum())

# Check for duplicate rows
print("\nDuplicate Rows Check:")
print(data1.duplicated().sum())

# Print the cleaned data
print("\nCleaned Data:")
print(data1)

# Initialize LabelEncoder
le = LabelEncoder()

# Encode every non-numeric column (this covers 'workex', 'status' and 'hsc_s' when present);
# LogisticRegression cannot be fitted on string values
categorical_columns = [c for c in data1.columns if not pd.api.types.is_numeric_dtype(data1[c])]
for column in categorical_columns:
    # fit_transform refits the encoder, so one LabelEncoder can be reused per column
    data1[column] = le.fit_transform(data1[column])

# Prepare features and target
x = data1.drop('status', axis=1, errors='ignore')  # Features
y = data1['status']  # Target

# Split the data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

# Train the model
lr = LogisticRegression(solver="liblinear")
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
confusion = confusion_matrix(y_test, y_pred)
classification_report1 = classification_report(y_test, y_pred)

print("\nAccuracy:", accuracy)
print("Confusion Matrix:\n", confusion)
print("Classification Report:\n", classification_report1)

# Print the y_pred array
print("\nY Prediction Array:")
print(y_pred)
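
The label-encoding step in the program replaces each category with an integer. The sketch below imitates that behaviour in pure Python, assuming (as `LabelEncoder` does) that the distinct values are sorted before being numbered. The function name `label_encode` and the sample values are hypothetical and are not taken from Placement_Data.csv.

```python
def label_encode(values):
    """Map each distinct value to an integer, numbering the values in sorted order."""
    classes = sorted(set(values))
    mapping = {label: index for index, label in enumerate(classes)}
    return [mapping[v] for v in values], classes


# Made-up sample of a status column (illustration only)
status = ["Placed", "Not Placed", "Placed", "Placed", "Not Placed"]
encoded, classes = label_encode(status)

print(classes)   # ['Not Placed', 'Placed']
print(encoded)   # [1, 0, 1, 1, 0]
```

Because the numbering follows sort order, it is worth checking which class maps to 1 before reading the confusion matrix or the classification report.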

Output :

Placement Data :

image

Salary Data :

image

Null Values Check (isnull) :

image

Duplicate Rows Check :

image

Cleaned Data :

image

Y-Prediction Array :

image

Missing Values Check :

image

Accuracy value :

image

Confusion Matrix :

image

Classification Report :

image
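
The accuracy and confusion matrix listed in this section can be recomputed by hand from the true and predicted labels. The following is a minimal pure-Python sketch for binary labels 0 and 1, using the same layout as scikit-learn's `confusion_matrix` (rows are true labels, columns are predicted labels). The function names and the label lists are hypothetical; the labels are made up and are not the program's actual output.

```python
def confusion_counts(y_true, y_pred):
    """2x2 matrix for labels 0/1: rows are true labels, columns are predicted labels."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels (illustration only)
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

print(confusion_counts(y_true, y_pred))   # [[2, 1], [1, 4]]
print(accuracy(y_true, y_pred))           # 0.75
```

The diagonal entries count the correct predictions, so the accuracy equals their sum divided by the number of samples: (2 + 4) / 8 = 0.75.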

Result :

Thus, the program to implement the Logistic Regression Model to Predict the Placement Status of Student is written and verified using Python.

About

License: BSD 3-Clause "New" or "Revised" License