karenwky / Visualization_Hong_Kong_Property

Web scraping and EDA (Exploratory Data Analysis) project

Visualization: Hong Kong Property

Using data scraped from the Hong Kong Property website, this project carries out an EDA (Exploratory Data Analysis).

Data Source

  1. Transaction History
    Using transaction data from the past 3 years, explore the distribution of property transactions in Hong Kong.

  2. Find Property
    Scrape property details such as number of bedrooms and selling price, then explore the correlation between the features.
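
Scraped fields usually arrive as free text, so they need cleaning before any correlation can be computed. The sketch below shows one way to do this with regular expressions. The raw strings and the helper names `parse_price` and `parse_bedrooms` are hypothetical; the actual page format and the notebook's own cleaning code are not shown in this README.

```python
import re


def parse_price(text):
    """Return the first number in the text as an int, or None if there is none.

    Assumption: prices are written with optional thousands separators,
    e.g. "HKD 8,500,000". The real site format may differ.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return int(float(match.group().replace(",", "")))


def parse_bedrooms(text):
    """Return the bedroom count from text such as "3 Bedrooms", or None."""
    match = re.search(r"(\d+)\s*bed", text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


# Hypothetical raw strings, for illustration only.
price = parse_price("HKD 8,500,000")
bedrooms = parse_bedrooms("3 Bedrooms")
```

Returning `None` for text without a number (for example a listing with no stated price) keeps missing values explicit instead of silently turning them into zero.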

Findings

Property Transactions in HK by District
The districts fall into two clusters. In the purple districts, only a few estates have a relatively high number of transactions. In contrast, in the cyan districts, a relatively large number of estates have a high number of transactions. The purple districts therefore have a more skewed distribution than the cyan districts.
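
The difference between the two clusters can also be expressed as a number by computing the skewness of the per-estate transaction counts in each district. This is a minimal sketch with made-up figures. The column names `district` and `transactions`, and the districts "A" and "B", are assumptions for illustration and are not taken from the notebook.

```python
import pandas as pd

# Made-up per-estate transaction counts, for illustration only.
# District "A" mimics a purple district: a few estates dominate.
# District "B" mimics a cyan district: many estates are similarly active.
estates = pd.DataFrame(
    {
        "district": ["A"] * 5 + ["B"] * 5,
        "transactions": [5, 8, 10, 12, 300, 80, 90, 100, 110, 120],
    }
)

# A mean far above the median and a large positive skew both indicate
# a long right tail, i.e. a handful of very active estates.
summary = estates.groupby("district")["transactions"].agg(["median", "mean", "skew"])
print(summary)
```

On this toy data, district "A" has a strongly positive skew while district "B" is symmetric, which mirrors the purple and cyan pattern described above.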


Box Plot
For most estates, the number of transactions is below 200. N.T. East (grey box) has the highest upper extreme and the highest median number of transactions. Kowloon Central (purple box) has the most outliers.
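
The quantities read off the box plot (median, upper extreme, number of outliers) can be computed directly. The sketch below assumes the default Tukey rule used by Matplotlib and Seaborn, where points more than 1.5 times the interquartile range beyond the quartiles count as outliers. The README does not state which whisker setting was used, and the helper `box_stats` is a hypothetical name.

```python
import pandas as pd


def box_stats(values):
    """Return the median, upper whisker and outlier count for one box.

    Assumption: whiskers follow the 1.5 * IQR rule (the plotting default).
    """
    s = pd.Series(values, dtype="float64")
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = s[(s >= lower_fence) & (s <= upper_fence)]
    return {
        "median": s.median(),
        # The upper whisker ends at the largest value that is not an outlier.
        "upper_whisker": inside.max(),
        "n_outliers": int(len(s) - len(inside)),
    }


# Made-up transaction counts for a single district, for illustration only.
stats = box_stats([10, 20, 30, 40, 50, 60, 70, 80, 500])
```

Applying the same function to each district would make it possible to rank districts by median, upper extreme and outlier count without reading values off the chart.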


Bar Chart
The total number of transactions per district shows a slightly different picture from the per-estate distribution. Although N.T. East has the highest upper extreme and the highest median number of transactions (see the box plot above), it ranks only third in total number of transactions. Kowloon Central, with its many outliers, has the highest total number of transactions.
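
A district can lead on the median and still trail on the total, because the total is driven by a few very active estates. The sketch below illustrates this with made-up figures. The districts "X" and "Y" and the column names are assumptions for illustration, not data from the project.

```python
import pandas as pd

# Made-up per-estate transaction counts, for illustration only.
# "X": consistently active estates. "Y": mostly quiet estates plus one outlier.
estates = pd.DataFrame(
    {
        "district": ["X"] * 3 + ["Y"] * 5,
        "transactions": [100, 110, 120, 10, 20, 30, 40, 900],
    }
)

# Aggregate per district, then rank by total as a bar chart would.
summary = (
    estates.groupby("district")["transactions"]
    .agg(total="sum", median="median", estates="count")
    .sort_values("total", ascending=False)
)
print(summary)
```

Here "X" has the higher median (110 against 30) but "Y" has the higher total (1000 against 330), the same kind of reversal described for N.T. East and Kowloon Central.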


3D Scatter Plot
In the scraped data, most selling prices are below HKD 50 million. The flats mainly have 1 to 4 bedrooms, and the efficiency ratio is mostly between 60 and 90 percent.
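
The ranges above can be checked numerically as well as visually. The sketch below assumes the efficiency ratio is saleable area divided by gross floor area, which is the usual meaning in Hong Kong listings. The README does not define the term, and the column names and figures are made up for illustration.

```python
import pandas as pd

# Made-up listings, for illustration only.
listings = pd.DataFrame(
    {
        "price_hkd": [8_000_000, 12_000_000, 60_000_000],
        "bedrooms": [2, 3, 4],
        "saleable_sqft": [450, 600, 300],
        "gross_sqft": [600, 750, 500],
    }
)

# Assumption: efficiency ratio = saleable area / gross floor area, in percent.
listings["efficiency_pct"] = 100 * listings["saleable_sqft"] / listings["gross_sqft"]

# Share of listings that fall inside each range mentioned in the findings.
share_under_50m = (listings["price_hkd"] < 50_000_000).mean()
share_1_to_4_beds = listings["bedrooms"].between(1, 4).mean()
share_eff_60_90 = listings["efficiency_pct"].between(60, 90).mean()
```

Reporting these shares alongside the scatter plot would turn "mostly" and "mainly" into concrete percentages.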

Detailed Presentation

Slides

Skills Acquired

  • Pandas
  • Beautiful Soup
  • Regex text cleaning
  • Seaborn
  • Matplotlib


Languages

Language:Jupyter Notebook 100.0%