Najrul-Ansari / Classification-Models

Explore a collection of classification machine learning models implemented in Python. This repository showcases a variety of algorithms applied to different datasets, demonstrating their effectiveness in solving classification problems.

Classification-Models

Bank Term Deposit Subscription Prediction

Project 7 - Bank_prj_7

  • Dataset name - bank-full.csv is the raw data and bank-full_cleaned.csv is the processed data for the model. Both datasets are in the dataset.zip file.

  • Objective - Predict whether the client has subscribed to a term deposit or not.

  • Tools - Python code in Jupyter notebook, Powerpoint, Excel.

  • Algorithm Used - Logistic Regression

    • Logistic Regression - Logistic regression is a statistical method used for binary classification, which means predicting the probability of an observation belonging to one of two classes. Despite its name, logistic regression is a classification algorithm, not a regression algorithm. It is widely used in machine learning and statistics for tasks such as spam detection, medical diagnosis, and credit scoring.
  • Methodology:

    • Data Collection - The data used to build the model was provided in CSV format (bank-full.csv). The target records whether the client subscribed to a term deposit.

    • Data Pre-processing - Data pre-processing is a crucial step to ensure the quality and suitability of the dataset for training machine learning models.

    • Feature Selection - Feature selection is a critical step to identify the most relevant variables that contribute to the predictive power of the model.

    • Model Selection - The machine learning algorithms for the predictive analysis are chosen in this step, based on how well each aligns with the project objective.

    • Model Training - In the model training section, the selected model is fit on the processed data so that it can make predictions on new, unseen data.

    • Model Evaluation - In the model evaluation section, the performance of the trained machine learning models is assessed to select the best suited model for deployment.
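
The logistic regression idea described above (turn a weighted sum of features into a probability, then threshold it) can be sketched in plain Python. This is only an illustration, not the notebook's code: the function names, the single scaled feature and the toy values are all hypothetical, and a real project would more likely use a library implementation.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def predict(w, b, x, threshold=0.5):
    """Class label: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(w, b, x) >= threshold else 0

# Hypothetical toy data: one already-scaled feature, label 1 = subscribed.
X_toy = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
y_toy = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X_toy, y_toy)
```

Calling `predict(w, b, [1.2])` then returns the class for a new observation, and `predict_proba` exposes the underlying probability if a threshold other than 0.5 is preferred.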

iPhone Purchase Prediction

Project 8 - iphone_purchase_prj8

iPhone purchases are increasing day by day, and many stores want to predict whether a customer will purchase an iPhone from their store given the customer's gender, age and salary.

  • Dataset name - iphone_purchase_records.csv is the raw data and iphone_purchase_records_cleaned.csv is the processed data for the model. Both datasets are in the dataset.zip file.

  • Objective - Predict whether a customer will purchase or not.

  • Tools - Python code in Jupyter notebook, Tableau for visualization, PowerPoint, Excel.

  • Algorithm Used - KNeighborsClassifier, Decision Tree, Random Forest.

    • KNeighborsClassifier - KNeighborsClassifier is a classification algorithm based on the K-Nearest Neighbors (KNN) approach. It is part of the scikit-learn library in Python and is used for solving classification problems. KNN is a simple and intuitive algorithm that classifies a new data point based on the majority class of its k-nearest neighbors in the feature space.

    • Decision Tree - A decision tree is a supervised machine learning algorithm used for both classification and regression tasks. It works by recursively partitioning the dataset into subsets based on the values of different features. The objective is to create a tree-like structure of decisions, where each node represents a decision based on a particular feature, and each leaf node represents the predicted outcome.

    • Random Forest - A Random Forest is an ensemble learning method that operates by constructing a multitude of decision trees during training and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. It is a powerful and versatile machine learning algorithm known for its high accuracy and robustness.

  • Methodology:

    • Data Collection - The data used to build the model was provided in CSV format. It records each customer's gender, age and salary, and whether that person purchased the phone, encoded as 1 for yes and 0 for no.

    • Data Pre-processing - Data pre-processing is a crucial step to ensure the quality and suitability of the dataset for training machine learning models.

    • Feature Selection - Feature selection is a critical step to identify the most relevant variables that contribute to the predictive power of the model.

    • Model Selection - The machine learning algorithms for the predictive analysis are chosen in this step, based on how well each aligns with the project objective.

    • Model Training - In the model training section, the selected model is fit on the processed data so that it can make predictions on new, unseen data.

    • Model Evaluation - In the model evaluation section, the performance of the trained machine learning models is assessed to select the best suited model for deployment.
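
The k-nearest-neighbors vote described above can be sketched without any library. The snippet below is a hedged illustration rather than the notebook's code: the helper names and the age/salary values are hypothetical. It also min-max scales the features first, since age and salary live on very different scales and an unscaled distance would be dominated by salary.

```python
import math
from collections import Counter

def fit_min_max(rows):
    """Return per-column minimums and maximums of the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def scale(row, lows, highs):
    """Rescale one row to [0, 1] per column using the training min/max."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    )

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_k = [label for _, label in distances[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Hypothetical (age, salary) records; label 1 = purchased, 0 = did not.
raw_X = [(22, 20000), (25, 25000), (30, 30000),
         (45, 90000), (50, 120000), (48, 100000)]
labels = [0, 0, 0, 1, 1, 1]

lows, highs = fit_min_max(raw_X)
train_X = [scale(row, lows, highs) for row in raw_X]

def classify(age, salary, k=3):
    """Scale a new customer with the training statistics, then vote."""
    return knn_predict(train_X, labels, scale((age, salary), lows, highs), k)
```

With this toy data, `classify(47, 95000)` lands among the purchasers and `classify(24, 22000)` among the non-purchasers.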

Company Sales Prediction

Project 9 - company_data_prj_9

  • Dataset name - Company_Data.csv is the raw data and Company_Data_cleaned.csv is the processed data for the model. Both datasets are in the dataset.zip file.

  • Objective - A cloth manufacturing company wants to know which segments or attributes lead to high sales.

  • Tools - Python code in Jupyter notebook, Powerpoint, Excel.

  • Algorithm Used - Random Forest

    • Random Forest - A Random Forest is an ensemble learning method that operates by constructing a multitude of decision trees during training and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. It is a powerful and versatile machine learning algorithm known for its high accuracy and robustness.

  • Methodology:

    • Data Collection - The data used to build the model was provided in CSV format (Company_Data.csv).

    • Data Pre-processing - Data pre-processing is a crucial step to ensure the quality and suitability of the dataset for training machine learning models.

    • Feature Selection - Feature selection is a critical step to identify the most relevant variables that contribute to the predictive power of the model.

    • Model Selection - The machine learning algorithms for the predictive analysis are chosen in this step, based on how well each aligns with the project objective.

    • Model Training - In the model training section, the selected model is fit on the processed data so that it can make predictions on new, unseen data.

    • Model Evaluation - In the model evaluation section, the performance of the trained machine learning models is assessed to select the best suited model for deployment.
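
The Random Forest description above (many trees trained on resampled data, with the predicted class being the mode of their votes) can be sketched in a few lines. This is a deliberately simplified illustration, not the notebook's code: each "tree" here is a single-split stump, the per-split random feature sampling of a full Random Forest is omitted, and the two features and the "High"/"Low" labels are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most common label (the mode)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Fit a one-split tree: pick the feature/threshold with the fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(v != l_lab for v in left)
                      + sum(v != r_lab for v in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:
        # No valid split (all rows identical): always predict the majority.
        lab = majority(y)
        return (0, float("inf"), lab, lab)
    return best[1:]

def predict_stump(stump, x):
    """Follow the single split to a leaf label."""
    j, t, l_lab, r_lab = stump
    return l_lab if x[j] <= t else r_lab

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Predict the mode of the individual trees' votes."""
    return majority(predict_stump(s, x) for s in forest)

# Hypothetical data: only the first feature separates the two classes.
X_toy = [(1, 5), (2, 4), (3, 6), (7, 5), (8, 4), (9, 6)]
y_toy = ["Low", "Low", "Low", "High", "High", "High"]
forest = fit_forest(X_toy, y_toy)
```

Counting how often each feature index is chosen across the stumps (the first element of each tuple in `forest`) gives a rough hint of which attribute drives the split, which is the kind of question the objective above asks.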

Fraud Check Prediction

Project 10 - fraud_check_prj_10

  • Dataset name - Fraud_check.csv is the raw data and Fraud_check_cleaned.csv is the processed data for the model. Both datasets are in the dataset.zip file.

  • Objective - Classify each record in the fraud-check data.

  • Tools - Python code in Jupyter notebook, Powerpoint, Excel.

  • Algorithm Used - Random Forest

    • Random Forest - A Random Forest is an ensemble learning method that operates by constructing a multitude of decision trees during training and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. It is a powerful and versatile machine learning algorithm known for its high accuracy and robustness.

  • Methodology:

    • Data Collection - The data used to build the model was provided in CSV format (Fraud_check.csv).

    • Data Pre-processing - Data pre-processing is a crucial step to ensure the quality and suitability of the dataset for training machine learning models.

    • Feature Selection - Feature selection is a critical step to identify the most relevant variables that contribute to the predictive power of the model.

    • Model Selection - The machine learning algorithms for the predictive analysis are chosen in this step, based on how well each aligns with the project objective.

    • Model Training - In the model training section, the selected model is fit on the processed data so that it can make predictions on new, unseen data.

    • Model Evaluation - In the model evaluation section, the performance of the trained machine learning models is assessed to select the best suited model for deployment.
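
The model evaluation step can be made concrete with a confusion matrix and the metrics derived from it. The sketch below is an assumption about how such an assessment might look, not the notebook's code: the function names and the toy labels are hypothetical. Precision and recall are included because accuracy alone can be misleading if one class is much rarer than the other.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of everything predicted positive, how much really was positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of everything really positive, how much the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
scores = evaluate(y_true, y_pred)
```

Comparing these scores across candidate models on the same held-out data is one way to pick the model best suited for deployment.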

Languages

Language:Jupyter Notebook 100.0%