tal95shah / OLX_Scraper

:radio: An OLX scraper built with Scrapy and MongoDB. It scrapes recent ads posted for a requested product and stores them in MongoDB.

OLX_Scraper

An OLX scraper built with Scrapy and MongoDB. It scrapes recent ads posted for a requested product and dumps them into a MongoDB (NoSQL) database.

NOTE: This repository is not maintained anymore.

Screenshot

[Screenshot of the scraped ads as stored in MongoDB]

About

A Scrapy program that scrapes recent ads for a product and stores them in a MongoDB database. All the information about the product to search for lives in args.py.

To search for a different product, change the values that follow the return statement in args.py.
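The contents of args.py are not reproduced here, so the following is only a hypothetical sketch of the kind of file described: a function whose returned values describe the search. The function name `get_search_args` and the keys `product` and `city` are assumptions made for illustration, not names taken from the repository.

```python
# Hypothetical layout for args.py; the real file may use different names.
def get_search_args():
    """Return the search parameters the spider should use."""
    # Edit the values after `return` to search for a different product.
    return {
        "product": "laptop",  # assumed key: the product to search for
        "city": "lahore",     # assumed key: an optional location filter
    }
```

With a layout like this, the spider would import the function and read the returned dictionary when building its search request.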

Usage

Before using the scraper, install selenium and parsel, plus pymongo for the MongoDB pipeline. Open a command line and run the command below:

pip install pymongo
Configure these settings in settings.py:

ITEM_PIPELINES = {
    'olx_scraper.pipelines.MongoDBPipeline': 300,
}
MONGODB_SERVER = "localhost"  # can be changed
MONGODB_PORT = 27017          # set to whatever port MongoDB runs on in your system
MONGODB_DB = ""               # set this
MONGODB_COLLECTION = ""       # set this

Once the configuration above is done, open a command line and run:
scrapy crawl scrape_olx
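The `MongoDBPipeline` class referenced in `ITEM_PIPELINES` is not shown in this README. As a rough illustration of how such a pipeline usually consumes the `MONGODB_*` settings, here is a minimal sketch that follows the standard Scrapy item-pipeline pattern with pymongo. It is an assumption about the general shape, not the repository's actual code.

```python
class MongoDBPipeline:
    """Sketch: store each scraped item as one document in a MongoDB collection."""

    def __init__(self, server, port, db_name, collection_name):
        self.server = server
        self.port = port
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook and exposes settings.py via crawler.settings.
        settings = crawler.settings
        return cls(
            server=settings.get("MONGODB_SERVER", "localhost"),
            port=settings.getint("MONGODB_PORT", 27017),
            db_name=settings.get("MONGODB_DB"),
            collection_name=settings.get("MONGODB_COLLECTION"),
        )

    def open_spider(self, spider):
        # Imported lazily so the module can be loaded without the driver installed.
        import pymongo

        self.client = pymongo.MongoClient(self.server, self.port)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Convert the item to a plain dict and insert it as a single document.
        self.collection.insert_one(dict(item))
        return item
```

Returning the item from `process_item` lets any later pipeline stages keep processing it.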

Result

Open a MongoDB GUI and check the database. Your result should look like the screenshot shown above.

Gotchas

1- You must have Python 3.6 installed to use this software.
2- Make sure MongoDB is running before you run the spider.
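To check the second point before starting a crawl, a small pre-flight script can help. The sketch below uses only the standard library and tests whether anything accepts TCP connections on the configured host and port. The function name `mongodb_is_reachable` is made up for this example, and a successful connection only shows that something is listening there, not that it is definitely MongoDB.

```python
import socket


def mongodb_is_reachable(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if mongodb_is_reachable():
        print("MongoDB port is reachable; you can run the spider.")
    else:
        print("Nothing is listening on localhost:27017; start MongoDB first.")
```

The defaults mirror the `MONGODB_SERVER` and `MONGODB_PORT` values shown in the settings; pass different arguments if your server runs elsewhere.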

If you run into any issue, please report it in the Issues section. Thanks!

License:Apache License 2.0


Languages

Language:Python 100.0%