jmftrindade / redfin_analysis

Dead-simple analysis (e.g., simple histograms) of recently sold homes using Redfin data.

Description

Dead-simple analysis (e.g., simple histograms) of recently sold homes using Redfin data.

The analysis currently covers only specific types of homes (2 to 3 BR, 1.25+ BA, 1200+ sqft) sold in the last 90 days, and only in select cities in the Greater Boston area.
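As a rough illustration of the criteria above, here is a minimal pandas sketch of such a filter. The column names (`beds`, `baths`, `sqft`, `sold_date`), the helper `filter_homes`, and the sample rows are all hypothetical; the scraper's actual output columns may differ.

```python
import pandas as pd


def filter_homes(df: pd.DataFrame, today: pd.Timestamp) -> pd.DataFrame:
    """Keep 2 to 3 BR, 1.25+ BA, 1200+ sqft homes sold in the 90 days before `today`.

    Column names are assumptions made for this sketch.
    """
    cutoff = today - pd.Timedelta(days=90)
    mask = (
        df["beds"].between(2, 3)        # inclusive on both ends
        & (df["baths"] >= 1.25)
        & (df["sqft"] >= 1200)
        & (df["sold_date"] >= cutoff)
    )
    return df[mask]


# Made-up rows, only to exercise the filter.
homes = pd.DataFrame(
    {
        "beds": [2, 4, 3, 3],
        "baths": [1.5, 2.0, 1.0, 2.5],
        "sqft": [1300, 2000, 1500, 1250],
        "sold_date": pd.to_datetime(
            ["2024-05-01", "2024-05-10", "2024-05-15", "2023-12-01"]
        ),
    }
)
selected = filter_homes(homes, today=pd.Timestamp("2024-06-01"))
```

Only the first row satisfies all four conditions here; the others fail on bedrooms, bathrooms, and sale date respectively.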

Requirements

For running Jupyter notebooks:

# Install prereqs.
$ pip install wheel
$ pip install ipykernel jupyter

For running the "analysis" in the notebook:

# Data processing.
$ pip install pandas
$ pip install numpy

# Optional if you do some ML.
$ pip install scikit-learn

# For seaborn cumulative distplots (aka CDFs); seaborn itself must be installed too.
$ pip install seaborn statsmodels

Scrape Redfin Data

The scraper is a slightly modified version of https://github.com/micahsteinberg/redfin-recently-sold-property-scraper.

Make sure to update the ids of cities of interest, which are currently hardcoded in the script.

The script uses Redfin city ids, not neighborhood ids. For Burlington, MA, for example, you want "29663" (https://www.redfin.com/city/29663/MA/Burlington) and not "497396" (https://www.redfin.com/neighborhood/497396/MA/Burlington/Burlington).
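To avoid mixing up the two kinds of ids, one option is to extract the id from the URL and reject anything that is not a city page. The helper below is a hypothetical sketch (it is not part of the scraper) and assumes the URL layout shown in the two Burlington examples.

```python
from urllib.parse import urlparse


def redfin_city_id(url: str) -> str:
    """Return the city id from a Redfin city URL.

    Assumes paths of the form /city/<id>/<state>/<name>; raises ValueError
    for anything else, including /neighborhood/... URLs.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "city" or not parts[1].isdigit():
        raise ValueError(f"not a Redfin city URL: {url}")
    return parts[1]


city_id = redfin_city_id("https://www.redfin.com/city/29663/MA/Burlington")
```

Passing the neighborhood URL instead raises `ValueError`, which surfaces the mistake before the id is pasted into the script.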

To run the scraper:

$ python3 main.py

Run Notebook

$ jupyter notebook
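For a sense of what a "dead-simple" histogram and CDF look like in the notebook, here is a minimal NumPy-only sketch. It assumes sale price is the quantity of interest; the prices and bin edges are made up for illustration, and the notebook itself may compute and plot these differently (e.g., with seaborn).

```python
import numpy as np

# Made-up sale prices, for illustration only.
prices = np.array([450_000, 520_000, 610_000, 610_000, 780_000, 950_000])

# Histogram: number of homes falling into each price bin.
bin_edges = [400_000, 600_000, 800_000, 1_000_000]
counts, edges = np.histogram(prices, bins=bin_edges)

# Empirical CDF: fraction of homes sold at or below each sorted price.
sorted_prices = np.sort(prices)
cdf = np.arange(1, len(sorted_prices) + 1) / len(sorted_prices)

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:>9,} to {hi:>9,}: {n}")
```

Plotting `sorted_prices` against `cdf` as a step function gives the cumulative view that the seaborn distplots provide.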

Languages

Jupyter Notebook 97.3%, Python 2.7%