edgarmuyomba / logo-scraper

A python scraper used to download company logos by category

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Logo Scraper

This is a webscraper built using python to collect and download logos for various companies from 1000logos. It uses a combination of python requests and beautiful soup to collect the top10 logos for the specified categories and downloads their images automatically.

It uses the python threading module to download images for each of the categories simultaneously to make the process much faster. It also uses the session module to set pre-defined custom headers ensuring that the scraper is not blocked from the website.

This project was used to collect data as part of the preparatory stages for the Logo API - A free api used to retrieve various company logos.

Requirements

  1. Make sure to create the category folders you want
  2. The names of the folders should match the categories listed in index.py
  3. Make sure to specify the default image type while downloading. the default is set to png
  4. Make sure to install all listed requirements in the requirements.txt

Built using

  1. Requests
  2. Session
  3. BeautifulSoup4
  4. Threading

Caution

The scraper might fail to work as intended due to changes in the website layout. If you're proficient in webscraping using python, feel free to modify the code otherwise regular maintenance updates will be rolled out.

About

A python scraper used to download company logos by category


Languages

Language:Python 100.0%