Code for scraping images from "depositphotos.com" for your computer vision (CV) tasks.
"Depositphotos.com" is an international photobank with millions of pictures.
If you need to create a unique CV dataset with specific images or targets, it is useful to collect the whole dataset from one resource. Depositphotos.com provides flexible search and filters to narrow your request.
- Go to "depositphotos.com"
- Type your request in the search form
- Add search filters
- Copy the link
Download the dataset_scraper.py file or copy it from the repo.
Open a terminal in the folder where you saved dataset_scraper.py.
Run the command:
python dataset_scraper.py -p "link" -n "number_of_pages_for_scraping" -f "folder_to_save_files"
For example:
python dataset_scraper.py -p 'https://depositphotos.com/stock-photos/sunglasses-man.html?sh=b7a729fc0832fe1d266e59e5d3701bc47222c6cf&filter=all' -n 10 -f ./dataset/sunglasses_man/
Key | Value | Description | Default |
---|---|---|---|
p | str | path to initial link | None |
n | int | number of pages for parsing | 20 |
f | str | path to save images | None |
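
The options in the table could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual contents of dataset_scraper.py: the function name `build_parser`, the help strings, and the example.com link are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table; not the script's real code."""
    parser = argparse.ArgumentParser(description="Scrape images from a search link")
    parser.add_argument("-p", type=str, default=None, help="initial search link")
    parser.add_argument("-n", type=int, default=20, help="number of pages to parse")
    parser.add_argument("-f", type=str, default=None, help="folder to save images")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-p", "https://example.com/search", "-f", "./dataset/"])
print(args.p, args.n, args.f)
```

Because `-n` is omitted in the call above, it falls back to the default of 20 listed in the table.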
If the saving folder -f doesn't exist, create it.
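
If you prefer to create the folder from Python, something like the sketch below would work. The helper name `ensure_folder` is hypothetical and is not taken from dataset_scraper.py; the example uses a temporary directory so it does not touch your real dataset path.

```python
import os
import tempfile


def ensure_folder(path: str) -> str:
    """Create the folder (and any missing parents) if it does not exist yet."""
    os.makedirs(path, exist_ok=True)  # exist_ok avoids an error on repeat calls
    return path


# Demonstrate on a throwaway temporary directory.
base = tempfile.mkdtemp()
target = ensure_folder(os.path.join(base, "dataset", "sunglasses_man"))
print(os.path.isdir(target))
```

Calling `ensure_folder` a second time on the same path is harmless, which is convenient when the scraper is re-run with the same -f value.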
Note:
- one page contains about 100 images, so if you need, for example, 1000 images, set -n to 10;
- don't forget to check the data before training a CV model.
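
The page-count arithmetic from the note can be sketched as a small helper. The figure of 100 images per page is the approximate value stated in the note, and the names `IMAGES_PER_PAGE` and `pages_needed` are hypothetical.

```python
import math

IMAGES_PER_PAGE = 100  # approximate value from the note; real pages may vary


def pages_needed(n_images: int, per_page: int = IMAGES_PER_PAGE) -> int:
    """Return the smallest page count that yields at least n_images."""
    return math.ceil(n_images / per_page)


print(pages_needed(1000))
print(pages_needed(1050))
```

Rounding up means a request for 1050 images maps to 11 pages rather than 10, so you end up with slightly more images than needed instead of fewer.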