medOblla / MDLPC_proj

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Project objectif :

Explore at least five supervised models from the list below, to make a forecast of the target variable Display (Y) using as independent variables X1...X7.

Of the proposed models, at least two approaches based on a Datamart with nominal features - consider discretizing the continuous variables intelligently (MDLPC)

What is required to turn in :

Input : Describe any transformation of the input if different from the initial input (New formatting of the input data for a model application) Model :

  • Briefly describe the model
  • Define the role of each parameter and how the model parameters are fitted (setup & fitting parameters)
    Output:
  • Comment on the result of each model
  • Compare the different models

Try the models

  • Logit model: The logit model, also known as a logistic regression model, is used for binary classification problems. It estimates the probability of the target variable being a certain class given the independent variables.

  • ADL model: Linear discriminant analysis (LDA) is a method used for classification and dimensionality reduction. It aims to find a linear combination of the independent variables that best separates the different classes of the target variable.

  • Decision tree: Decision trees are a type of supervised learning algorithm used for both classification and regression problems. They work by recursively partitioning the data into subsets based on the values of the independent variables.

  • Random Forest: A random forest is an ensemble of decision trees. The idea is to average the predictions of many trees to reduce the variance and increase the accuracy of the model.

  • Neural network: Neural networks are a type of machine learning model that are inspired by the structure and function of the human brain. They can be used for a wide range of problems, including classification and regression.

  • Gradient Boosting machine: Gradient Boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.

  • Support Vector Machine: SVM is a supervised learning algorithm that can be used for classification or regression problems. It works by finding the optimal hyperplane that maximally separates the different classes of the target variable.

About


Languages

Language:Jupyter Notebook 97.3%Language:Python 2.7%