Crawler

Crawler is a web crawler that scrapes images of beautiful girls from the website http://www.kindgirls.com/

Testing Environment

Clone this repo
Run crawler.py: python crawler.py
Enter the start and end months in the form <year>-<month>. Note that the acceptable range is from 2003-07 to the present month.
Wait a while, then open the "image" folder in the same directory as crawler.py
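
The <year>-<month> input from the steps above could be validated and expanded into a list of months roughly as follows. This is only a sketch: `parse_month`, `month_range`, and `EARLIEST` are hypothetical names chosen for illustration and are not taken from crawler.py.

```python
from datetime import date

# Earliest month the README says is accepted.
EARLIEST = (2003, 7)


def parse_month(text):
    """Parse '<year>-<month>' (e.g. '2003-07') into a (year, month) tuple."""
    year_str, month_str = text.strip().split("-")
    year, month = int(year_str), int(month_str)
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12: %r" % text)
    return year, month


def month_range(start, end, today=None):
    """Return every (year, month) from start to end, inclusive."""
    today = today or date.today()
    latest = (today.year, today.month)
    first, last = parse_month(start), parse_month(end)
    # Tuples compare year first, then month, so plain comparison works here.
    if first < EARLIEST or last > latest or first > last:
        raise ValueError("range must lie between 2003-07 and the present month")
    months = []
    year, month = first
    while (year, month) <= last:
        months.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months
```

Malformed input such as "2003/07" or a month outside the accepted range raises ValueError, so a caller can prompt for the value again.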

This web crawler only scrapes images from Kindgirls and may not work on other websites with a different structure.
Within each gallery (<year>/<month>/<girlname>), the crawler creates a thread for each image. Queued threads start one after another, 0.5 s apart. After a gallery is scraped, the crawler pauses for a random number of seconds (1-5) to avoid being detected as a robot.
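
A minimal sketch of that per-gallery pacing is shown here. It assumes the download step is passed in as a function taking one image URL; `scrape_gallery`, `download`, and the parameter names are hypothetical, not names from crawler.py. It also assumes the 1-5 second pause is a whole number of seconds, which the description does not specify.

```python
import random
import threading
import time


def scrape_gallery(image_urls, download, start_delay=0.5,
                   pause_range=(1, 5), sleep=time.sleep):
    """Download one gallery: one thread per image, then a random pause."""
    threads = []
    for url in image_urls:
        thread = threading.Thread(target=download, args=(url,))
        thread.start()
        threads.append(thread)
        # Stagger thread starts so requests do not all fire at once.
        sleep(start_delay)
    # Wait until every image in this gallery has finished downloading.
    for thread in threads:
        thread.join()
    # Assumption: an integer number of seconds, inclusive on both ends.
    pause = random.randint(*pause_range)
    sleep(pause)
    return pause
```

The `sleep` parameter exists only so the pacing can be replaced in tests; in normal use the default `time.sleep` applies.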
Occasionally, the remote host refuses the crawler's requests for unknown reasons. In this case, the crawler sleeps for 10 minutes and then restarts.
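
The sleep-and-restart behaviour could look something like the sketch below. The names `run_with_restart`, `crawl`, `max_restarts`, and `retry_on` are hypothetical. The restart limit is an added assumption, since the description above does not say whether the crawler ever gives up, and the exception type to catch depends on the HTTP library crawler.py uses, which is not stated here.

```python
import time

# Ten minutes, as described above.
RETRY_WAIT_SECONDS = 10 * 60


def run_with_restart(crawl, max_restarts=3, wait=RETRY_WAIT_SECONDS,
                     retry_on=(ConnectionError,), sleep=time.sleep):
    """Call crawl(); if the connection fails, sleep and start it again."""
    for attempt in range(max_restarts + 1):
        try:
            return crawl()
        except retry_on:
            # Assumption: give up after max_restarts failed restarts.
            if attempt == max_restarts:
                raise
            sleep(wait)
```

The built-in `ConnectionError` covers `ConnectionRefusedError` and `ConnectionResetError`; a third-party HTTP library may raise its own exception classes, in which case they would need to be passed via `retry_on`.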