Explore at least five supervised models from the list below to predict the target variable Display (Y), using X1...X7 as independent variables.
Of the proposed models, at least two approaches should be based on a Datamart with nominal features; consider discretizing the continuous variables intelligently (MDLPC).
Input: Describe any transformation of the input if it differs from the initial input (new formatting of the input data for applying a model)
Model:
- Briefly describe the model
- Define the role of each parameter and how the model parameters are fitted (setup & fitting parameters)
Output:
- Comment on the result of each model
- Compare the different models
- Logit model: The logit model, also known as logistic regression, is used for binary classification problems. It estimates the probability that the target variable belongs to a given class as a logistic (sigmoid) function of a linear combination of the independent variables.
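The logit model can be sketched in a few lines. This is a minimal illustration only, assuming plain batch gradient descent on the log-loss; the function names (`fit_logit`, `predict_proba`), the learning rate, the number of epochs, and the toy data are all made up for the example and are not part of the assignment data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear score is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy data (hypothetical): one feature, class 1 for the larger values.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logit(X, y)
print(predict_proba(w, b, [0.0]), predict_proba(w, b, [5.0]))
```

In practice a library implementation with regularization would normally replace this loop; the sketch only shows where the fitted parameters (weights and intercept) come from.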
- LDA model: Linear discriminant analysis (LDA) is a method used for classification and dimensionality reduction. It finds the linear combination of the independent variables that best separates the classes of the target variable.
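As an illustration of the "linear combination that best separates the classes", here is a small sketch of the two-class Fisher discriminant for two features. It assumes equal class priors (the threshold is the projected midpoint of the class means); the helper names and the toy points are hypothetical.

```python
def mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter(rows, mu):
    """Within-class scatter matrix (2x2) of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """Return w = Sw^-1 (mu1 - mu0) and a midpoint threshold (equal priors assumed)."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    threshold = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mu0, mu1))
    return w, threshold

def classify(w, threshold, x):
    """Project x onto w and compare with the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

# Toy data (hypothetical): two well-separated groups of points.
class0 = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
class1 = [[6.0, 5.0], [7.0, 6.0], [6.0, 7.0], [7.0, 8.0]]
w, threshold = fisher_direction(class0, class1)
print(w, threshold)
```

The projection `w[0] * x[0] + w[1] * x[1]` is also the one-dimensional reduced representation that LDA provides for two classes.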
- Decision tree: Decision trees are supervised learning algorithms used for both classification and regression problems. They work by recursively partitioning the data into subsets based on the values of the independent variables, choosing at each step the split that makes the resulting subsets most homogeneous with respect to the target.
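A single partitioning step can be sketched as follows. The sketch assumes Gini impurity as the split criterion (entropy is another common choice) and a single numeric feature; a full tree would apply `best_split` recursively to each resulting subset. Names and data are illustrative.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 2.0 * p1 * (1.0 - p1)  # equals 1 - p0^2 - p1^2 for two classes

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split 'x <= threshold'."""
    values = sorted(set(xs))
    best = (None, float("inf"))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Toy data (hypothetical): the classes separate cleanly between 3 and 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)
```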
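- Random Forest: A random forest is an ensemble of decision trees, each trained on a bootstrap sample of the data with a random subset of the variables considered at each split. Averaging the predictions of many trees (or taking a majority vote) reduces the variance and usually improves accuracy compared with a single tree.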
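The ensemble idea can be illustrated with a deliberately simplified sketch: bootstrap samples, one-split trees ("stumps"), and a majority vote. This is bagging rather than a full random forest, since with a single feature there is no random feature subset to draw; the function names, the number of trees, the seed, and the data are assumptions made for the example.

```python
import random

def fit_stump(xs, ys):
    """Threshold t minimizing training errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict(stumps, x):
    """Majority vote over all stumps."""
    votes = sum(x > t for t in stumps)
    return 1 if 2 * votes > len(stumps) else 0

# Toy data (hypothetical), with one noisy label at x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
stumps = fit_forest(xs, ys)
print(predict(stumps, 0.0), predict(stumps, 9.0))
```

Each stump sees a slightly different sample and so picks a slightly different threshold; the vote smooths out those differences, which is the variance reduction described above.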
- Neural network: Neural networks are machine learning models loosely inspired by the structure and function of the human brain. They consist of layers of units, each applying a nonlinear activation to a weighted sum of its inputs, and can be used for a wide range of problems, including classification and regression.
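To show what the layers of a network compute, here is a sketch of a forward pass through a tiny network with two inputs, two hidden ReLU units, and one linear output. The weights are hand-picked for the illustration so that the network reproduces XOR; they are not fitted, and in a real model they would be learned from data (typically by backpropagation).

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one row of W per unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hand-picked (not fitted) weights: hidden units relu(a + b) and relu(a + b - 1).
W1, b1 = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
# Output unit: h1 - 2 * h2.
W2, b2 = [[1.0, -2.0]], [0.0]

def forward(x):
    hidden = dense(x, W1, b1, relu)
    return dense(hidden, W2, b2, identity)[0]

for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, forward(x))
```

XOR cannot be represented by a single linear unit, so the example also shows why the hidden layer and its nonlinearity matter.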
- Gradient Boosting machine: Gradient boosting is a machine learning technique for regression and classification problems that produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. The ensemble is built sequentially: each new weak model is fitted to the errors of the ensemble built so far.
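The sequential construction can be sketched for the simplest case: regression with squared loss, where the negative gradient is just the residual, and one-split regression trees as weak learners. The number of rounds, the learning rate, the function names, and the data are assumptions for the illustration; a classification variant would use a different loss.

```python
def fit_stump(xs, targets):
    """Regression stump: threshold and two leaf means minimizing squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lm if x <= t else rm for t, lm, rm in stumps)

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# Toy data (hypothetical): a step from about 1 to about 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
model = fit_gbm(xs, ys)
print(mse([predict(model, x) for x in xs], ys))
```

The learning rate shrinks each stump's contribution, so many small corrections accumulate instead of one large one.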
- Support Vector Machine: SVM is a supervised learning algorithm that can be used for classification or regression problems. For classification, it finds the hyperplane that separates the classes of the target variable with the largest possible margin.
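A linear SVM can be sketched as the minimization of a regularized hinge loss. The version below assumes batch subgradient descent with a fixed step size, which is only one of several ways to fit the model (library solvers usually work on the dual problem or use more refined schedules). Labels are coded as -1 and +1; the function names, hyperparameters, and data are made up for the example.

```python
def fit_linear_svm(X, y, lam=0.01, lr=0.01, epochs=3000):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))), y in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy data (hypothetical): two linearly separable groups.
X = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]]
y = [-1, -1, -1, 1, 1, 1]
w, b = fit_linear_svm(X, y)
print(w, b, [predict(w, b, x) for x in X])
```

Only points with margin below 1 contribute to the update, which mirrors the fact that the fitted hyperplane depends on the support vectors rather than on every observation.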