Papenkov / Image-dataset-scraper

Image-scraping code for "depositphotos.com", for your computer vision (CV) tasks

Image-dataset-scraper

Image-scraping code for "depositphotos.com", for your CV tasks.

"Depositphotos.com" is an international photobank with millions of pictures.

If you need to create a unique CV dataset with specific images or targets, it is useful to collect the whole dataset from one resource. Depositphotos.com provides adaptive search and filters to narrow your request.

Usage

  • Add search filters
  • Copy the resulting link

Example

Download the dataset_scraper.py file, or copy it from the repo.

Open a terminal in the folder where you saved dataset_scraper.py.

Run the command:

python dataset_scraper.py -p "link" -n "number_of_pages_for_scraping" -f "folder_to_save_files"

python dataset_scraper.py -p 'https://depositphotos.com/stock-photos/sunglasses-man.html?sh=b7a729fc0832fe1d266e59e5d3701bc47222c6cf&filter=all' -n 10 -f ./dataset/sunglasses_man/

Arguments

Key   Value   Description                   Default
p     str     path to initial link          None
n     int     number of pages for parsing   20
f     str     path to save images           None

If the folder given by -f doesn't exist, it is created.
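The options in the table can be sketched with the standard argparse module. This is a minimal illustration, not the actual contents of dataset_scraper.py: the function names build_parser and ensure_folder are hypothetical, and only the flags, types, and defaults are taken from the table above.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented -p, -n, and -f options."""
    parser = argparse.ArgumentParser(description="Hypothetical CLI sketch")
    parser.add_argument("-p", type=str, default=None, help="path to initial link")
    parser.add_argument("-n", type=int, default=20, help="number of pages for parsing")
    parser.add_argument("-f", type=str, default=None, help="path to save images")
    return parser


def ensure_folder(path):
    """Create the save folder if it is missing; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(
    ["-p", "https://depositphotos.com/stock-photos/sunglasses-man.html", "-n", "10"]
)
print(args.p, args.n, args.f)
```

Because -n is declared with type=int, a value such as "10" on the command line arrives as the integer 10, and omitting it falls back to the default of 20.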

Note:

Each scraped page contains about 100 images, so if you need, for example, 1000 images, set -n to 10.

Don't forget to check the data before training a CV model.
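The page arithmetic from the first note can be sketched as a small helper. The name pages_needed is hypothetical, and the figure of 100 images per page is the approximate value stated above, so treat the result as an estimate.

```python
import math

# Approximate number of images on one results page, per the note above.
IMAGES_PER_PAGE = 100


def pages_needed(target_images, images_per_page=IMAGES_PER_PAGE):
    """Return how many pages to request with -n to reach target_images."""
    if target_images <= 0:
        return 0
    # Round up so a partial page still counts as one more page to scrape.
    return math.ceil(target_images / images_per_page)


print(pages_needed(1000))  # 10
print(pages_needed(1050))  # 11
```

Since the per-page count is only approximate, requesting one or two extra pages leaves room for images you discard when checking the data.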

