SkaarFacee / ScrapeIt

ScrapeIt 💣

This is a web scraper that collects all the details of the mobile phones listed on a mobile phone website.
The code has 3 parts:

  • script
  • spider
  • convert

Script file 🚀

This is the main file. It processes the list of phones and gets their respective links. Because the site is dynamic, the script checks whether a specific row number has loaded; if not, it waits until that row is loaded, then scrapes the page for all links that lead to the phones' specification pages. Finally, it saves the data as a JSON file.
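
The wait-then-scrape flow described above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual script.py: `wait_until`, `LinkCollector`, `collect_phone_links`, `save_links`, the `/phones/` link prefix and the `links.json` filename are all hypothetical, and the real script presumably drives a browser to load the dynamic page.

```python
import json
import time
from html.parser import HTMLParser


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll `condition` until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return bool(condition())


class LinkCollector(HTMLParser):
    """Collect {link text: href} for anchors whose href starts with `prefix`."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = {}
        self._href = None  # href of the anchor being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(self.prefix):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = "".join(self._text).strip()
            if name:
                self.links[name] = self._href
            self._href = None


def collect_phone_links(html, prefix="/phones/"):
    """Return {phone_name: relative_link} found in the page source."""
    parser = LinkCollector(prefix)
    parser.feed(html)
    parser.close()
    return parser.links


def save_links(links, path="links.json"):
    """Save the name-to-link mapping as JSON (assumed filename)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(links, fh, indent=2)
```

In such a sketch, `condition` would be a small function that asks the browser whether the expected row is present yet, and `collect_phone_links` would be called on the page source once `wait_until` returns True.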

Spider 🕷️

This uses the saved JSON file, which has the format phone_name:phone_relative_link. The spider uses this data to crawl each phone's specification page and saves the scraped details as a dictionary. Finally, the dictionaries for all the phones are collected into a list and saved to a file.
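
As a rough illustration of that flow (not the project's actual spider.py), the sketch below resolves each relative link against a base URL, turns two-column specification rows into a dictionary, and gathers one dictionary per phone. `SpecTableParser`, `parse_specs`, `crawl`, `save_phones`, the injected `fetch` callable, the two-column table layout and the `phones.json` filename are all assumptions made for the example.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class SpecTableParser(HTMLParser):
    """Turn rows with exactly two cells (label, value) into a dict."""

    def __init__(self):
        super().__init__()
        self.specs = {}
        self._cells = None  # cells of the current row, None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("td", "th") and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_data(self, data):
        if self._in_cell and self._cells:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2:
                label, value = (cell.strip() for cell in self._cells)
                if label:
                    self.specs[label] = value
            self._cells = None
            self._in_cell = False


def parse_specs(name, html):
    """Build one phone's dictionary from its specification page."""
    parser = SpecTableParser()
    parser.feed(html)
    parser.close()
    return {"name": name, **parser.specs}


def crawl(links, base_url, fetch):
    """links: {phone_name: relative_link}; fetch: callable url -> HTML text."""
    return [parse_specs(name, fetch(urljoin(base_url, rel)))
            for name, rel in links.items()]


def save_phones(phones, path="phones.json"):
    """Save the list of phone dictionaries (assumed filename and format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(phones, fh, indent=2)
```

Passing `fetch` in as a parameter keeps the parsing logic testable without network access; in real use it would be whatever function downloads a page and returns its HTML.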

Convert 📂

This converts the saved file into a CSV file so that the data is easier to work with.
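
A conversion of this kind might look like the sketch below, assuming the saved file is a JSON list of dictionaries. The function names and the `phones.json` / `phones.csv` filenames are hypothetical. Because different phones may list different specification fields, the header is built from the union of all keys, and missing values are left empty.

```python
import csv
import io
import json


def phones_to_csv(phones, out):
    """Write a list of dicts to `out` as CSV, one column per distinct key."""
    # Take the union of all keys, in the order they first appear.
    fieldnames = []
    for phone in phones:
        for key in phone:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(phones)


def convert(json_path="phones.json", csv_path="phones.csv"):
    """Read the saved JSON list and write it out as a CSV file."""
    with open(json_path, encoding="utf-8") as src:
        phones = json.load(src)
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        phones_to_csv(phones, dst)


if __name__ == "__main__":
    # Small in-memory demonstration with made-up data.
    buf = io.StringIO()
    phones_to_csv(
        [{"name": "Alpha 1", "RAM": "8 GB"},
         {"name": "Beta 2", "Battery": "5000 mAh"}],
        buf,
    )
    print(buf.getvalue())
```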

Note: the URL used is a public website, but it is stored as a variable in the secret.py file.
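
Since secret.py is not described further, anyone running the scripts would need to create it themselves. A minimal version might look like this; the variable name `URL` and the address are placeholders chosen for illustration, so check the import statements in script.py and spider.py for the name they actually expect.

```python
# secret.py (hypothetical contents; the real variable name may differ)
# Base address of the phone listing site to scrape.
URL = "https://example.com/phones"
```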

How to use: ☁️

  1. Run script.py
  2. Run spider.py
  3. Run convert.py

Contributing Help 💥

If you are really interested in contributing to the project, please follow the steps and rules below.

  1. Fork the project 🍴 (Star ⭐ the repo before that 😛)
  2. Clone it.
https://github.com/<username>/ScrapeIt.git
  3. Look for open issues under the Issues tab. Go through them and pick one. Make sure you get assigned, or at least say that you are going to work on it.
  4. Always create a new branch to work on a feature or bug. If you are not that familiar with branching, see Git Branching.
  5. If you use any other module to implement a new feature, please install it in the virtual environment and update requirements.txt using the command below.
pip freeze > requirements.txt

If you have any doubts or issues, let the maintainers know. They will be ready to help.

Languages

Language:Python 100.0%