To write a program that implements a Logistic Regression model to predict the placement status of a student.
- Hardware – PCs
- Anaconda – Python 3.7 Installation / Jupyter notebook
- Import the required packages and load the dataset.
- Print the placement data and salary data.
- Find the null and duplicate values.
- Using logistic regression, predict the placement status and compute the accuracy, confusion matrix, and classification report.
- Display the results.
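The evaluation step above can be illustrated without any third-party packages. The following is a minimal sketch, not the program itself: the helper names `confusion_counts` and `accuracy` are hypothetical, and the label lists are made-up toy values rather than results from the placement data. It assumes binary labels encoded as 0 and 1, and lays the matrix out with true labels as rows and predicted labels as columns, the layout `sklearn.metrics.confusion_matrix` uses.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for labels encoded as 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        # Row index = true label, column index = predicted label
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix):
    """Fraction of predictions on the diagonal (correct predictions)."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for illustration only (not taken from the dataset)
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

matrix = confusion_counts(y_true, y_pred)
print(matrix)            # [[2, 1], [1, 4]]
print(accuracy(matrix))  # 0.75
```

Accuracy is the sum of the diagonal divided by the total count, so the off-diagonal cells show which kind of mistake (false positive or false negative) the model makes.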
"""
Program to implement the Logistic Regression Model to Predict the Placement Status of Student.
Developed by: HEMAVATHY S
RegisterNumber: 212223230076
"""
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
# Load the data
data = pd.read_csv("Placement_Data.csv")
# Print the entire DataFrame
print("Placement Data:")
print(data)
# Print only the salary column (if it exists)
if 'salary' in data.columns:
    print("\nSalary Data:")
    print(data['salary'])
else:
    print("\n'Salary' column not found in DataFrame")
# Remove unnecessary columns (if any)
data1 = data.drop(["salary"], axis=1, errors='ignore')
# Check for missing values
print("\nMissing Values Check:")
print(data1.isnull().sum())
# Check for duplicate rows
print("\nDuplicate Rows Check:")
print(data1.duplicated().sum())
# Print the cleaned data
print("\nCleaned Data:")
print(data1)
# Initialize LabelEncoder
le = LabelEncoder()
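# Encode categorical columns
# Note: LogisticRegression needs numeric input, so any other text column
# present in the CSV must also be encoded (or dropped) before fitting.
categorical_columns = ['workex', 'status', 'hsc_s']  # List of categorical columns to encode
for column in categorical_columns:
    if column in data1.columns:
        data1[column] = le.fit_transform(data1[column])
    else:
        print(f"'{column}' column not found in DataFrame")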
# Prepare features and target
x = data1.drop('status', axis=1, errors='ignore') # Features
y = data1['status'] # Target
# Split the data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
# Train the model
lr = LogisticRegression(solver="liblinear")
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
confusion = confusion_matrix(y_test, y_pred)
classification_report1 = classification_report(y_test, y_pred)
print("\nAccuracy:", accuracy)
print("Confusion Matrix:\n", confusion)
print("Classification Report:\n", classification_report1)
# Print the y_pred array
print("\nY Prediction Array:")
print(y_pred)
Thus, the program to implement the Logistic Regression model to predict the placement status of a student is written and verified using Python.
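One caveat when reusing the program: it refits a single `LabelEncoder` for each column, so after the loop the encoder only remembers the classes of the last column and cannot decode `status` predictions. The sketch below is a dependency-free illustration of keeping one mapping per column. The helpers `fit_label_mapping`, `encode`, and `decode` are hypothetical stand-ins that mimic the sorted-class integer mapping `LabelEncoder` performs, and the sample values are assumptions, not rows from the dataset.

```python
def fit_label_mapping(values):
    """Map each distinct label to an integer, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


def encode(values, mapping):
    """Replace each label with its integer code."""
    return [mapping[value] for value in values]


def decode(codes, mapping):
    """Turn integer codes back into the original labels."""
    inverse = {code: label for label, code in mapping.items()}
    return [inverse[code] for code in codes]


# Illustrative values only (assumed, not read from Placement_Data.csv)
columns = {
    "workex": ["No", "Yes", "No"],
    "status": ["Placed", "Not Placed", "Placed"],
}

# One mapping per column, so every column can still be decoded later
mappings = {name: fit_label_mapping(values) for name, values in columns.items()}

encoded_status = encode(columns["status"], mappings["status"])
print(encoded_status)                       # [1, 0, 1]
print(decode([0, 1], mappings["status"]))   # ['Not Placed', 'Placed']
```

With scikit-learn, the same idea would be to keep a separate fitted encoder for each column (for example in a dictionary) instead of reusing one instance.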