Real Estate - Scraper

A scraper that gathers data from real estate ads.

Country	Website
Brazil	ZAP Imóveis

Installation

Requirements
Python 3.6
MongoDB

Clone this repository using git and cd into the project folder:

git clone https://github.com/pauloromeira/realestate-scraper.git && \
cd realestate-scraper

Inside the project folder, install the Python requirements using pip:

pip install -r requirements.txt

First, start the MongoDB server:

mongod &
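
Before crawling, it can help to confirm that the server is actually accepting connections. The following is a minimal sketch, not part of this project: `mongodb_is_listening` is a hypothetical helper, and it assumes the scraper talks to a local MongoDB on the default port 27017.

```python
import socket


def mongodb_is_listening(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper. 27017 is MongoDB's default port; that this
    project uses the default host and port is an assumption.
    """
    try:
        # A successful TCP connection is enough to show the port is open.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("MongoDB reachable:", mongodb_is_listening())
```

This only checks that the port is open; it does not verify that the process behind it is MongoDB.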

Then use the following command to start crawling:

scrapy crawl zap [-a url=<zapimoveis-url>] [-a start=n] [-a count=n] [-a seed=<seed>]

Currently, only ZAP Imóveis is supported.

Arguments:

count: limits the number of pages to crawl. By default, the crawler continues until the last page.
start: page number to start crawling from. The default is 1.
url: website URL on which to perform the search.
seed: seed for the website's search engine.
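
Scrapy passes each `-a name=value` pair to the spider's constructor as a string keyword argument. The sketch below shows one way `start` and `count` could translate into the page numbers to visit. The `page_numbers` helper and its `last_page` parameter are hypothetical and are not taken from this project's code.

```python
def page_numbers(start="1", count=None, last_page=None):
    """Yield the page numbers selected by the start and count arguments.

    Hypothetical helper. Values are accepted as strings because that is
    how Scrapy delivers -a arguments. last_page stands in for "the end"
    of the search results; a real spider would detect it while crawling.
    """
    first = int(start)
    page = first
    while True:
        # Stop once `count` pages have been produced, if a limit was given.
        if count is not None and page >= first + int(count):
            return
        # Stop after the last available page, if it is known.
        if last_page is not None and page > last_page:
            return
        yield page
        page += 1


if __name__ == "__main__":
    print(list(page_numbers(start="4", count="3")))
```

With neither `count` nor `last_page` set, the generator never stops on its own, which mirrors the "crawl till the end" default: something else has to signal that the results ran out.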

Default values: properties in Pernambuco, Brazil. Crawl all pages:
```
scrapy crawl zap
```

Olinda-PE. Crawl the first 4 pages:

scrapy crawl zap -a count=4 -a url="https://www.zapimoveis.com.br/venda/imoveis/pe+olinda/"

Rio de Janeiro-RJ - south zone. Starting at page 100, crawl till the end:

scrapy crawl zap -a start=100 -a url="https://www.zapimoveis.com.br/venda/imoveis/agr+rj+rio-de-janeiro+zona-sul/"

All places. Starting from page 4, crawl 3 pages:

scrapy crawl zap -a start=4 -a count=3 -a url="https://www.zapimoveis.com.br/venda/imoveis/"
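
When launching crawls from a script, the commands above can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_crawl_command` is a hypothetical helper, not part of this repository, and it only builds the argument list for the documented `scrapy crawl zap` options.

```python
import shlex


def build_crawl_command(url=None, start=None, count=None, seed=None):
    """Return the argument list for `scrapy crawl zap` with the given options.

    Hypothetical helper; the option names mirror the arguments
    documented in this README.
    """
    command = ["scrapy", "crawl", "zap"]
    options = {"url": url, "start": start, "count": count, "seed": seed}
    for name, value in options.items():
        if value is not None:
            # Omitted options are left out so the spider uses its defaults.
            command += ["-a", f"{name}={value}"]
    return command


command = build_crawl_command(
    url="https://www.zapimoveis.com.br/venda/imoveis/", start=4, count=3
)
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from inside the project folder.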