ScrapeIt 💣

ScrapeIt is a web scraper that collects all the details of the mobile phones listed on a mobile phone website.
The code has 3 parts:

script
spider
convert

Script file 🚀

This is the main file. It processes the list of phones and collects their respective links. As the site is dynamic, the script checks whether a specific row number has loaded; if not, it waits until that row is loaded and then scrapes the page for all links that lead to a phone's specification page. Finally, it saves the data as a JSON file.
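
The wait-then-scrape flow described above can be sketched with the standard library alone. This is a minimal illustration, not the code in script.py: the helper names (`wait_until`, `extract_phone_links`, `save_links`), the `<a class="phone">` markup, and the `phones.json` file name are all assumptions, and the real script most likely drives a browser to load the dynamic rows.

```python
import json
import time
from html.parser import HTMLParser


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it is truthy or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("row was not loaded in time")
        time.sleep(interval)


class PhoneLinkParser(HTMLParser):
    """Collect {phone_name: relative_link} from <a class="phone"> tags.

    The class name "phone" is a hypothetical stand-in for the real markup.
    """

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "phone":
            self._href = attrs.get("href")

    def handle_data(self, data):
        # The first non-empty text inside a matching link is the phone name.
        if self._href and data.strip():
            self.links[data.strip()] = self._href
            self._href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def extract_phone_links(html):
    parser = PhoneLinkParser()
    parser.feed(html)
    return parser.links


def save_links(links, path="phones.json"):
    """Write the name-to-link mapping as JSON (file name is an assumption)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(links, fh, indent=2)
```

Here `condition` would be a callable that reports whether the expected row is present, so the same helper works with whatever tool loads the page.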

Spider 🕷️

This uses the saved JSON file, which has the format phone_name:phone_relative_link. The spider uses this data to crawl each phone's page and saves the scraped data as a dictionary. Finally, the dictionaries of all the phones are collected into a list and saved to a file.
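
As a rough sketch of that loop, assuming the relative links are resolved against a base URL: `BASE_URL`, `parse_specs`, and the injected `fetch` callable are hypothetical names, and the "key: value" page format is invented for illustration. The actual spider.py may be built on a crawling framework and will parse real HTML.

```python
from urllib.parse import urljoin

# Placeholder only; in the project the real URL is kept in secret.py.
BASE_URL = "https://example.com/"


def parse_specs(page_text):
    """Hypothetical parser: turn 'key: value' lines into a dict."""
    specs = {}
    for line in page_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            specs[key.strip()] = value.strip()
    return specs


def crawl(links, fetch, base_url=BASE_URL):
    """Visit every phone page and return one dict per phone.

    links: {phone_name: phone_relative_link}, as loaded from the JSON file.
    fetch: callable taking an absolute URL and returning the page text.
    """
    phones = []
    for name, relative_link in links.items():
        page_text = fetch(urljoin(base_url, relative_link))
        record = {"name": name}
        record.update(parse_specs(page_text))
        phones.append(record)
    return phones
```

Passing `fetch` in as an argument keeps the loop independent of the HTTP client, which also makes it easy to try out with canned pages.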

Convert 📂

This converts the saved file into a CSV so that it is easier to work with.
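
One possible way to do this conversion with the standard csv module is shown below. It assumes the saved file is a JSON list of dictionaries whose keys may differ from phone to phone; the function names are illustrative and not taken from convert.py.

```python
import csv
import json


def write_csv(records, fh):
    """Write a list of dicts to an open text file as CSV.

    The header is the union of all keys, in first-seen order, so phones
    with missing fields simply get empty cells.
    """
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(fh, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return fieldnames


def convert(json_path, csv_path):
    """Load the spider's JSON output and save it as a CSV file."""
    with open(json_path, encoding="utf-8") as src:
        records = json.load(src)
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        return write_csv(records, dst)
```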

Note: the URL used is a public website, but it is stored as a variable in the secret.py file.

How to use: ☁️

Run script.py
Run spider.py
Run convert.py

Contributing Help 💥

If you are interested in contributing to the project, please follow the steps and rules below.

Fork the project 🍴 (Star ⭐ the repo before that 😛)
Clone it.

https://github.com/<username>/ScrapeIt.git

Look for open issues by clicking the Issues tab. Go through them and pick one. Make sure you get assigned, or at least say that you are going to work on it.
Always create a new branch to work on a feature or bug. If you are not that familiar with branching, check out Git Branching.
If you use any additional module to implement a new feature, please install it in the virtual environment and update requirements.txt using the command below.

pip freeze > requirements.txt

If you have any doubts or issues, let the maintainers know. They will be ready to help.

SkaarFacee / ScrapeIt