ShafeeqAhamedS / Mini-Project--Application-of-NN

Prediction of Liver Cirrhosis

Project Description:

About Liver Cirrhosis

  • Chronic liver damage from a variety of causes leading to scarring and liver failure.
  • Hepatitis and chronic alcohol abuse are frequent causes.
  • Liver damage caused by cirrhosis can't be undone, but further damage can be limited.
  • Initially patients may experience fatigue, weakness and weight loss.
  • During later stages, patients may develop jaundice (yellowing of the skin), gastrointestinal bleeding, abdominal swelling and confusion.

About the dataset

  • This data set contains 416 liver patient records and 167 non liver patient records collected from North East of Andhra Pradesh, India.
  • The "Dataset" column is a class label used to divide groups into liver patient (liver disease) or not (no disease).
  • This data set contains 441 male patient records and 142 female patient records.

Goal of the Project

  • The main aim of the project is to create an ANN model that classifies patients as infected or not infected based on various proteins in the blood.
  • Using simple blood tests, we can predict whether a patient is infected or not.

Algorithm:

  1. Import the Libraries.
  2. Read the Dataset.
  3. Check for null values; if there are any, fill them.
  4. Check for duplicated values; if there are any, remove them.
  5. Transform categorical values into numerical values.
  6. Check the correlation value of each feature.
  7. Drop uncorrelated features.
  8. Assign X and Y.
  9. Split the dataset into training and testing sets.
  10. Apply the MLP classifier and compute the accuracy.
  11. Analyze the metrics.
  12. Predict for a given input.

Program:

Import the Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Read & Basic info about Dataset

df = pd.read_csv("./Liver.csv")
df
df.info()
df.describe()
df.columns

Check for Null Values & Remove them

df.isnull().sum()
# Fill missing ratios with the column mean (parentheses allow the line break)
df['Albumin_and_Globulin_Ratio'] = (
   df['Albumin_and_Globulin_Ratio'].fillna(df['Albumin_and_Globulin_Ratio'].mean())
)
df.isnull().sum()
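
As a small illustration of the mean-filling step above, the sketch below applies the same `fillna` call to a toy column. The values are made up for the example and are not taken from `Liver.csv`.

```python
import numpy as np
import pandas as pd

# Toy column with one missing ratio (illustrative values, not from Liver.csv).
col = "Albumin_and_Globulin_Ratio"
toy = pd.DataFrame({col: [0.9, 1.1, np.nan, 1.0]})

# mean() skips NaN by default, so the mean is taken over the 3 known values.
mean_value = toy[col].mean()
toy[col] = toy[col].fillna(mean_value)

print(toy[col].tolist())
print("Remaining nulls:", toy[col].isnull().sum())
```

Mean imputation keeps every row, which matters for a dataset of only a few hundred records, but it also slightly shrinks the variance of the filled column.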

Check for Duplicate Values & Remove them

print("Duplicate Values =",df.duplicated().sum())
df[df.duplicated()]
df=df.drop_duplicates()
df.duplicated().sum()
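
The duplicate check above can be tried on a toy frame first. This is only a sketch with invented rows; the `reset_index` call is an optional addition that the program above does not use.

```python
import pandas as pd

# Toy frame in which the second row repeats the first (invented values).
toy = pd.DataFrame({
    "Age": [25, 25, 40],
    "Gender": ["Male", "Male", "Female"],
})

# duplicated() marks rows that are identical to an earlier row.
n_duplicates = int(toy.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row;
# reset_index(drop=True) renumbers the remaining rows from 0.
deduped = toy.drop_duplicates().reset_index(drop=True)

print("Duplicate rows:", n_duplicates)
print(deduped)
```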

Encode Values

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df["Gender"] = le.fit_transform(df["Gender"])

df['Dataset']=df['Dataset'].map({1:1,2:0})
df
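
To see which integer each gender receives, the sketch below runs `LabelEncoder` on a toy column. It assumes the `Gender` column holds the strings "Male" and "Female"; `LabelEncoder` assigns codes in sorted order of the labels.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy column; the label strings are an assumption about Liver.csv.
toy = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

le = LabelEncoder()
toy["Gender_encoded"] = le.fit_transform(toy["Gender"])

# classes_ is sorted, so the position of each label is its code.
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)
```

Under that assumption, `Gender = 0` in the custom inputs further down corresponds to "Female" and `Gender = 1` to "Male".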

Correlation Values

plt.figure(figsize=(10,5))
df.corr()['Dataset'].sort_values(ascending=False).plot(kind='bar',color='black')
plt.xticks(rotation=90)
plt.xlabel('Variables in the Data')
plt.ylabel('Correlation Values')
plt.show()

df = df.drop(["Total_Protiens","Albumin","Albumin_and_Globulin_Ratio"],axis=1)
df
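
The columns above were chosen by reading the bar chart. As an alternative sketch, weakly correlated columns can be picked programmatically by comparing the absolute correlation with a cut-off. The data here is synthetic and the `threshold` value is hypothetical, not one used by this project.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
target = rng.integers(0, 2, size=n)

# Synthetic frame: one column that tracks the label and one that is pure noise.
toy = pd.DataFrame({
    "informative": target + rng.normal(0, 0.5, size=n),
    "noise": rng.normal(0, 1, size=n),
    "Dataset": target,
})

# Correlation of every feature with the class label.
corr = toy.corr()["Dataset"].drop("Dataset")

threshold = 0.2  # hypothetical cut-off, chosen only for this example
weak = corr[corr.abs() < threshold].index.tolist()
reduced = toy.drop(columns=weak)

print("Dropped:", weak)
print("Kept:", reduced.columns.tolist())
```

Using the absolute value matters because a strongly negative correlation is as informative as a strongly positive one.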

Assigning X and Y

X = df.drop(['Dataset'], axis=1)
X
y = df['Dataset']
y

Splitting Dataset

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.30,random_state=101)
print("Training sample shape =",X_train.shape)
print("Testing sample shape =",X_test.shape)
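
Because the classes are imbalanced (416 liver patients versus 167 non-patients before duplicates are removed), a stratified split may be worth considering. The sketch below is a variation on the split above, not the split used in this project: it adds `stratify=y` and uses placeholder features.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Labels with the class counts quoted in the dataset description.
y = np.array([1] * 416 + [0] * 167)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature column

# stratify=y keeps the class proportions nearly equal in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101, stratify=y
)

print("Training sample shape =", X_train.shape)
print("Testing sample shape =", X_test.shape)
print("Positive share (all / train / test):",
      round(y.mean(), 3), round(y_train.mean(), 3), round(y_test.mean(), 3))
```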

Creating MLP

from sklearn.metrics import accuracy_score,classification_report,confusion_matrix
from sklearn.neural_network import MLPClassifier

# (8,) is a one-element tuple: a single hidden layer with 8 neurons
reg = MLPClassifier(hidden_layer_sizes=(8,), learning_rate_init=0.0001, max_iter=10000)
reg.fit(X_train, y_train)

log_predicted= reg.predict(X_test)
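
MLPs are usually sensitive to the scale of their inputs, and the blood test columns span very different ranges. The sketch below shows one possible variation, which is not part of the program above: standardizing the features inside a scikit-learn pipeline. It uses synthetic data from `make_classification` in place of `Liver.csv`, and the name `model` is introduced only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 7 features kept after dropping columns.
X, y = make_classification(n_samples=400, n_features=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101
)

# The scaler is fitted on the training data only, then applied to the test data.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print("Accuracy:", test_accuracy)
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training.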

Testing Metrics

print('Accuracy: \n', accuracy_score(y_test,log_predicted))
print('Confusion Matrix: \n', confusion_matrix(y_test,log_predicted))
sns.heatmap(confusion_matrix(y_test,log_predicted),annot=True,fmt="d")
print('Classification Report: \n', classification_report(y_test,log_predicted))
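
To make the relationship between the confusion matrix and the reported scores concrete, here is a small worked example on hand-written labels. The labels are illustrative and are not this project's predictions.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Hand-written labels: 1 = liver disease, 0 = no disease.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found

print(tn, fp, fn, tp)                 # 2 1 1 4
print(accuracy, precision, recall)    # 0.75 0.8 0.8

# The manual values agree with scikit-learn's own functions.
assert accuracy == accuracy_score(y_true, y_pred)
assert precision == precision_score(y_true, y_pred)
assert recall == recall_score(y_true, y_pred)
```

For a medical screening task, recall on the disease class is often the number to watch, since a false negative means a missed patient.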

Testing Custom Inputs

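# Each row follows the column order of X; Gender is label-encoded (0 or 1).
samples = [[25,0,0.1,0.1,44,4,8], [50,1,5,1,200,50,50]]
# Report each test separately instead of combining both predictions with "or".
for i, pred in enumerate(reg.predict(samples), start=1):
  if pred == 1:
    print(f"Test {i}: Infected with Liver Cirrhosis")
  else:
    print(f"Test {i}: Not Infected with Liver Cirrhosis")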

Output:

Read & Basic info about Dataset

Dataset

image

Info

Description

image

Columns

image

Check for Null Values & Remove them

Null Value - Before Removing

image

Null Value - After Removing

image

Check for Duplicate Values & Remove them

Total Duplicate Values

image image

Duplicate Values - After Removing

image

Encode Values

After Encoding

image

Correlation Values

Correlation

image

Dataset after dropping uncorrelated values

image

Splitting Dataset

Training and testing size

image

Testing Metrics

Accuracy

image

Confusion Matrix

image

image

Classification Report

image

Testing Custom Inputs

Normal Levels

  • Total bilirubin: 0.1 to 1.2 mg/dL
  • Direct bilirubin: less than 0.3 mg/dL
  • Alkaline_Phosphotase: 44 to 147 international units per liter (IU/L)
  • Alamine_Aminotransferase: 4 to 36 U/L
  • Aspartate_Aminotransferase: 8 to 33 U/L

Test 1

  • Age = 25
  • Gender = 0
  • Total_Bilirubin = 0.1
  • Direct_Bilirubin = 0.1
  • Alkaline_Phosphotase = 44
  • Alamine_Aminotransferase = 4
  • Aspartate_Aminotransferase = 8

image

Test 2

  • Age = 50
  • Gender = 1
  • Total_Bilirubin = 5
  • Direct_Bilirubin = 1
  • Alkaline_Phosphotase = 200
  • Alamine_Aminotransferase = 50
  • Aspartate_Aminotransferase = 50

image

Advantage :

  • This model helps predict liver cirrhosis from a blood test alone.
  • Confirming the diagnosis usually involves an MRI or another scan.
  • Predicting from blood test results therefore makes testing more cost effective.
  • An accuracy of 75% is a good starting point, and it can be increased further by tuning hyperparameters and regularizing the ANN.
  • These measures can be implemented in the next steps to make the model more accurate.
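
The hyperparameter tuning and regularization mentioned above could be approached with a grid search. The sketch below is one possible way to do it, not something this project has implemented: it runs on synthetic data, and the grid values are hypothetical. In `MLPClassifier`, `alpha` is the strength of the L2 regularization term.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 7 blood test features.
X, y = make_classification(n_samples=300, n_features=7, random_state=0)

# Hypothetical grid: two layer sizes and two regularization strengths.
param_grid = {
    "hidden_layer_sizes": [(8,), (16,)],
    "alpha": [1e-4, 1e-2],
}

search = GridSearchCV(
    MLPClassifier(max_iter=1000, random_state=0),
    param_grid,
    cv=3,                 # 3-fold cross-validation on the training data
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

The cross-validated score is a fairer basis for comparing settings than a single train/test split, especially on a dataset of a few hundred rows.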

Result:

Thus, an MLP is trained to classify whether a patient is infected with liver cirrhosis or not, based on various blood test results, with nearly 75% (74.269%) accuracy. Refer to the Colab file HERE.

A Project By:

Shafeeq Ahamed.S - 212221230092

Sanjay Kumar.S.S - 212221240048
