SaadARazzaq / Gradient-Descent-LR

Code without built in ML libraries => ML ASSIGNMENT 1 Q1

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

House Price Prediction Analysis ๐Ÿ ๐Ÿ’ฐ

Overview โ„น๏ธ:

This repository contains Python code to analyze and predict house prices using linear regression. It includes preprocessing steps, gradient descent implementation, closed-form solution computation, visualization, and comparison of results.

Instructions ๐Ÿ› ๏ธ:

  1. Open the notebook in Google Colab.
  2. Ensure the required libraries (Pandas, NumPy, Matplotlib) are installed.
  3. Upload the DataX.dat and DataY.dat files to your Colab environment.
  4. Execute the code cells sequentially to preprocess data, train the model, and visualize the results.

Code Snippet ๐Ÿ“„:

# See the notebook file in the repo

Results ๐Ÿ“ˆ:

Case A:

  • Values of parameters:

    • Intercept (ฮธ0): 335,373.9
    • Coefficient for Living Area (ฮธ1): 100,780.44
    • Coefficient for Bedrooms (ฮธ2): 38,772.17
    • Coefficient for Floors (ฮธ3): -40,555.62
  • Scale of Parameter Values:

    • The parameter values are in a reasonable range and can be interpreted directly. For example, the coefficient for the Living Area (ฮธ1) suggests that, on average, an increase of 1 unit in the living area results in an increase of approximately $100,780.44 in house price.
  • Interpretability:

    • Case A provides interpretable coefficients that can be used to understand the relationships between the input features (Living Area, Bedrooms, and Floors) and the output (House Prices).

Case B:

  • Closed form solution:

    • These parameter values are extremely large, e.g., ฮธ0 โ‰ˆ 3.75e+15, ฮธ1 โ‰ˆ 8.67e+18, ฮธ2 โ‰ˆ 1.24e+16, ฮธ3 โ‰ˆ 8.56e+15.
  • Comparison:

    • Scale of Parameter Values:

      • The parameter values in Case B are exceptionally large and likely indicate a problem with the model or numerical instability. It's not practical to interpret such large parameter values in the context of a linear regression model.
    • Interpretability:

      • Case B Parameter values are so large that they do not provide meaningful insights or interpretations.
  • Potential Issues:

    • Case B Parameter values suggest numerical instability or problems with the modeling process. Such large parameter values are typically indicative of issues like multicollinearity, improper feature scaling, or divergence in the optimization algorithm.

Conclusion ๐Ÿ“:

  • Gradient Descent: Provides parameter values suitable for interpretation, with reasonable magnitudes indicating the impact of features on house prices.
  • Closed-form Solution: May encounter numerical instability issues, leading to extremely large parameter values, making interpretation challenging.

Note ๐Ÿ“Œ:

  • Ensure proper preprocessing and feature scaling to enhance model performance and stability.
  • Experiment with different learning rates and iterations for gradient descent optimization.
  • Visualize data and results to gain insights into the relationships between features and house prices.

About

Code without built in ML libraries => ML ASSIGNMENT 1 Q1


Languages

Language:Jupyter Notebook 100.0%