- Preprocess the dataset as specified in the data mining process.
- Handle missing values and outliers, if any.
- Produce Q-Q Plots and Histograms of the features, and apply the transformations if required.
- If required, apply suitable feature encoding techniques.
- Scale and/or standardize the features, and produce relevant graphs to show the effect of scaling/standardizing.
- If necessary, apply feature discretization, and produce a relevant graph to show the discretization.
- Perform feature engineering by executing the following tasks:
- Appropriately use PCA (Principal Component Analysis) or SVD (Singular Value Decomposition) for feature reduction.
- Identify significant and independent features using appropriate techniques. Show how you selected the features using suitable graphs.
- Apply the following techniques to predict the value of Y (Estimated Shares Outstanding) for the test dataset (K = 10):
- Linear Regression with Cross Validation
- Lasso Regression with Cross Validation
- Ridge Regression with Cross Validation
- Using suitable evaluation metrics, compare the applicability of the different regression models on the given dataset.
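
The modelling steps listed above could be sketched roughly as follows. This is a minimal illustration, not a reference solution. It assumes scikit-learn is available, reads K = 10 as the number of cross-validation folds, and uses a synthetic dataset from `make_regression` as a stand-in for the real data. The `alpha` values and the 95% explained-variance threshold for PCA are illustrative choices, not tuned settings.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption): the real features and the target Y
# (Estimated Shares Outstanding) would be loaded here instead.
X, y = make_regression(n_samples=300, n_features=12, n_informative=6,
                       noise=10.0, random_state=0)

# 10-fold cross-validation, shuffled with a fixed seed for reproducibility.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Regularization strengths are illustrative, not tuned.
models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
}

results = {}
for name, model in models.items():
    # Scaling and PCA sit inside the pipeline so they are refit on each
    # training fold, which avoids leaking held-out fold statistics.
    pipe = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.95, svd_solver="full"),  # keep 95% of variance
        model,
    )
    scores = cross_validate(
        pipe, X, y, cv=cv,
        scoring=["r2", "neg_root_mean_squared_error"],
    )
    results[name] = {
        "r2": scores["test_r2"].mean(),
        # The scorer returns negated RMSE, so flip the sign back.
        "rmse": -scores["test_neg_root_mean_squared_error"].mean(),
    }

# Compare the models on mean R^2 and mean RMSE across the 10 folds.
for name, m in results.items():
    print(f"{name:>6}: R^2 = {m['r2']:.3f}, RMSE = {m['rmse']:.2f}")
```

Keeping the scaler and PCA inside the pipeline is a deliberate choice: fitting them on the full dataset before cross-validation would let information from the held-out folds influence the transformation. On real data, `LassoCV` and `RidgeCV` could replace the fixed `alpha` values to select the regularization strength by cross-validation.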