tranquyenbk173 / Crawler_batdongsan.com.vn

Selenium version

Thank you for your help!!! Love you <3


You can follow either of the two sets of instructions below :p

Instruction 1: Run on your local machine

  • Download the link_list.txt file to the Crawler_batdongsan.com.vn/ directory. Get the link from me :))) >>>>>>

  • Download the chromedriver that matches the version of your Chrome browser from https://chromedriver.chromium.org/downloads. The attached file ./chromedriver corresponds to Version 87.0.4280.88 (Official Build) (64-bit). Then fix line 100 in crawl_data_batdongsan.com.vn.py accordingly.

  • Edit lines 106 and 107 in crawl_data_batdongsan.com.vn.py to set the start position and the number of records you want to crawl.

  • Then run: python3 crawl_data_batdongsan.com.vn.py in the terminal, or run it from a GUI.

  • When everything is done, please send me the result files: data_batdongsan_com_vn.csv and start_end.txt
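To make the settings in the steps above more concrete, here is a minimal sketch of how a chromedriver path, a start position and a record count might be used. It is not the code in crawl_data_batdongsan.com.vn.py: the names CHROMEDRIVER_PATH, START, NUM_RECORDS, read_links, select_links, make_driver and crawl are all hypothetical, the sketch assumes link_list.txt holds one URL per line, and it assumes Selenium 4 (older Selenium releases pass the driver path differently).

```python
from pathlib import Path

# Hypothetical settings; the real script defines its own values
# on the lines mentioned in the steps above.
CHROMEDRIVER_PATH = "./chromedriver"
START = 0          # position of the first link to crawl
NUM_RECORDS = 100  # how many links to crawl, counting from START


def read_links(path):
    """Return the non-empty lines of a link list file (assumed: one URL per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def select_links(links, start, count):
    """Return at most `count` links beginning at position `start`."""
    return links[start:start + count]


def make_driver(chromedriver_path):
    """Start Chrome through Selenium 4.

    Selenium is imported here so the helpers above work without it installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=chromedriver_path))


def crawl(link_file="link_list.txt"):
    """Visit each selected link and print its page title."""
    links = select_links(read_links(link_file), START, NUM_RECORDS)
    driver = make_driver(CHROMEDRIVER_PATH)
    try:
        for url in links:
            driver.get(url)
            print(url, driver.title)
    finally:
        # Always close the browser, even if a page fails to load.
        driver.quit()
```

Calling crawl() would open Chrome and visit the selected links; the slicing in select_links simply stops at the end of the list if fewer than NUM_RECORDS links remain.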

Instruction 2: Run with Google Colab

  • Download the files from the link I sent you

  • Use Selenium_Real_Estate.ipynb to run the crawler on Google Colab.

  • Your Google Drive directory should then contain the following files: (links_list.txt, chromedriver, Selenium_Real_Estate.ipynb)

  • You will need to edit the 6th cell of the Colab notebook: at lines 107 and 108, set your start and end indices

  • Then run it :p

  • When everything is done, please send me the result files: data_batdongsan_com_vn.csv and start_end.txt
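For the Colab steps above, a rough sketch of the two things that usually differ from a local run: Chrome has to run without a display, and the range is given as a start and end index. This is not the notebook's actual cell. The helper names are hypothetical, the end index is assumed to be exclusive, and Selenium 4 is assumed.

```python
def colab_chrome_args():
    """Chrome flags commonly needed on a machine without a display, such as Colab."""
    return ["--headless", "--no-sandbox", "--disable-dev-shm-usage"]


def links_in_range(links, start, end):
    """Return links[start:end]; `end` is treated as exclusive (an assumption)."""
    return links[start:end]


def make_headless_driver(chromedriver_path):
    """Start headless Chrome through Selenium 4.

    Selenium is imported here so the helpers above work without it installed.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in colab_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(
        service=Service(executable_path=chromedriver_path),
        options=options,
    )
```

If the notebook treats the end index as inclusive instead, the slice would need to be links[start:end + 1].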

Again! Love you <3

And the Scrapy version is at https://github.com/hustducnv/batdongsan.com.vn_spider (faster!)

Languages

Jupyter Notebook 98.1%, Python 1.9%