thequbit / BarkingOwl

A scalable web scraper framework for finding documents on websites.

Dispatcher is throwing out data blindly ... might want to rethink that

thequbit opened this issue

If the scraper fails, the dispatcher gets no feedback that it failed, so a URL could go without being scraped. Perhaps some additional handshaking should be added.

This still needs to be addressed. One option is a periodic check, for example once an hour, that verifies the scraper is still alive and updates the database accordingly.
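A rough sketch of what such a liveness check could look like. Everything here is hypothetical and not taken from the BarkingOwl code: the `UrlTracker` class, its method names, and the in-memory dict standing in for the database are assumptions made only for illustration.

```python
import time


class UrlTracker:
    """Hypothetical in-memory stand-in for the dispatcher's database."""

    def __init__(self, timeout_seconds=3600.0):
        self.timeout_seconds = timeout_seconds
        self.in_flight = {}  # url -> time of the last sign of life

    def dispatch(self, url, now=None):
        # Record when the URL was handed to a scraper.
        self.in_flight[url] = time.time() if now is None else now

    def heartbeat(self, url, now=None):
        # A scraper reports it is still working on this URL.
        if url in self.in_flight:
            self.in_flight[url] = time.time() if now is None else now

    def requeue_stale(self, now=None):
        # Return (and forget) URLs whose scraper has been silent too long,
        # so the caller can dispatch them again.
        now = time.time() if now is None else now
        stale = [
            url
            for url, seen in self.in_flight.items()
            if now - seen > self.timeout_seconds
        ]
        for url in stale:
            del self.in_flight[url]
        return sorted(stale)
```

Calling `requeue_stale()` from an hourly timer would give the dispatcher a list of URLs to hand out again, instead of silently losing them.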

This is handled with the 'scraper_finished' command.
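For readers following along, here is a minimal sketch of how a dispatcher might react to such a command. Only the command name 'scraper_finished' comes from this thread; the JSON message layout, the `url` field, and the `handle_message` function are assumptions and may not match the actual implementation.

```python
import json


def handle_message(raw, pending):
    """Mark a URL as done when a 'scraper_finished' message arrives.

    `raw` is an assumed JSON message; `pending` is a set of URLs that
    were dispatched but not yet confirmed. Returns True if a URL was
    confirmed, False otherwise.
    """
    msg = json.loads(raw)
    if msg.get("command") != "scraper_finished":
        return False  # some other command; not handled here
    url = msg.get("url")
    if url in pending:
        pending.remove(url)
        return True
    return False  # unknown or already-confirmed URL
```

Any URL still left in `pending` after a scraper goes quiet was never confirmed and can be dispatched again.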