Predicting the impact of different social and economic factors on US Housing Prices
I've organised the analysis into 3 notebooks:
- Data Gathering: This notebook ingests publicly available online datasets into pandas, then cleans and parses them into a usable form. The analysis draws on several data sources, which are merged here into combined dataframes used by all the other notebooks. The combined datasets are also saved in the data directory, from where the other notebooks import them.
- EDA: Exploratory data analysis, mainly plots showing the relationships between various features.
- Model Building: Here, I've trained and evaluated a linear regressor on the data and documented the results. The model is used to understand the effect of different factors on US single-family home prices. I've used the S&P/Case-Shiller U.S. National Home Price Index as a proxy for home prices.
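The workflow described above (combine several sources into one dataframe, then fit a linear model to read off the effect of each factor) can be sketched roughly as follows. This is a minimal illustration and not the notebooks' actual code: the column names (`hpi`, `mortgage_rate`, `unemployment`) and all values are made up, and ordinary least squares via `numpy.linalg.lstsq` stands in for whichever linear regressor the Model Building notebook uses.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series; names and values are invented for illustration.
dates = pd.date_range("2020-01-01", periods=6, freq="MS")
home_price_index = pd.DataFrame(
    {"date": dates, "hpi": [210.0, 212.0, 215.0, 217.0, 220.0, 224.0]}
)
mortgage_rate = pd.DataFrame(
    {"date": dates, "mortgage_rate": [3.6, 3.5, 3.4, 3.3, 3.2, 3.1]}
)
unemployment = pd.DataFrame(
    {"date": dates, "unemployment": [3.5, 3.6, 4.4, 4.0, 3.9, 3.7]}
)

# Combine the separate sources on their shared date column.
combined = home_price_index.merge(mortgage_rate, on="date").merge(
    unemployment, on="date"
)

# Ordinary least squares with an intercept column prepended to the features.
features = ["mortgage_rate", "unemployment"]
X = np.column_stack([np.ones(len(combined)), combined[features].to_numpy()])
y = combined["hpi"].to_numpy()
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Each coefficient is the fitted change in the index per unit change in a factor.
coefficients = dict(zip(["intercept"] + features, coef))
for name, value in coefficients.items():
    print(f"{name}: {value:.3f}")
```

The sign and size of each fitted coefficient is what gets interpreted as the effect of that factor on the index. To reuse the combined frame across notebooks, it could be written out with `DataFrame.to_csv` and read back with `pandas.read_csv`.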
S&P/Case-Shiller Home Price Indices track changes in single-family residential home prices across the US. The indices are built from sale pairs, that is, homes that have been sold at least twice in arm's-length transactions; this is the repeat-sales method. Comparing sales of the same home avoids having to account for price differences between homes with varying characteristics. The indices can therefore be said to measure the change in single-family home prices over time at a constant level of quality. For this reason I've not used any quality-related characteristics of homes, such as the number of bedrooms or bathrooms, furnishing, etc.
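To see why sale pairs help, here is a toy two-period sketch with made-up prices. It is a deliberate simplification: the real index is estimated across many periods with a more involved weighting scheme, whereas this only takes the geometric mean of each home's price ratio and compares it with a naive average over whichever homes happened to sell.

```python
from statistics import geometric_mean, mean

# (first sale price, second sale price) for the same home; values are invented.
sale_pairs = [
    (200_000, 220_000),
    (500_000, 550_000),
    (300_000, 330_000),
]

# Repeat-sales view: compare each home only with its own earlier sale.
ratios = [second / first for first, second in sale_pairs]
repeat_sales_change = geometric_mean(ratios) - 1

# Naive view: compare average prices of whatever sold in each period.
# The second period here happens to contain a cheaper mix of homes.
period_one_sales = [200_000, 500_000, 300_000]
period_two_sales = [220_000, 150_000]
naive_change = mean(period_two_sales) / mean(period_one_sales) - 1

print(f"repeat-sales change: {repeat_sales_change:.1%}")
print(f"naive average change: {naive_change:.1%}")
```

Every home in the example gained 10%, and the repeat-sales figure reflects that. The naive average shows a large drop, purely because a different mix of homes sold in the second period.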
- Just click the file you want to explore:
Powered by raw.githack.com
- Clone the repository or download the zip file.
- Extract the .zip file to an empty directory.
- Use jupyter-notebook or jupyter-lab to open .ipynb files.