microsoft / ghcrawler

Crawl GitHub APIs and store the discovered orgs, repos, commits, ...

Review crawler error handling

jeffmcaffer opened this issue

From time to time there is evidence that some errors can leak out of the crawler loop. In particular, it seems possible for a request marked as "being processed" to not get unmarked when processing is done. A subsequent loop that then gets that request thinks it is already being processed (i.e., a collision).

Do a deep review and update the tests to validate that all reject and throw cases (promise rejections and synchronous exceptions) are handled in the loop.
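
To make the failure mode concrete, here is a minimal sketch of the kind of guard the review could look for. It is not ghcrawler's actual code: `inFlight`, `processOne`, and the `handler` argument are hypothetical names chosen for illustration. The idea is that the "being processed" mark is cleared in a `finally` block, so it is removed whether the handler resolves, rejects, or throws synchronously.

```javascript
// Hypothetical sketch, not ghcrawler's real implementation.
// Tracks the URLs of requests currently marked as "being processed".
const inFlight = new Set();

// Runs one iteration of a crawler-style loop for a single request.
// `handler` is an assumed callback that may return a promise, reject, or throw.
async function processOne(request, handler) {
  if (inFlight.has(request.url)) {
    // Someone else holds the mark: report a collision and leave the mark alone.
    return { outcome: 'collision' };
  }
  inFlight.add(request.url);
  try {
    // The call sits inside the try block, so a synchronous throw and an
    // asynchronous rejection both end up in the catch below.
    await handler(request);
    return { outcome: 'done' };
  } catch (error) {
    // The error is contained here instead of leaking out of the loop.
    return { outcome: 'error', error };
  } finally {
    // Always unmark, on every exit path that set the mark.
    inFlight.delete(request.url);
  }
}
```

A test for this shape would drive `processOne` once with a handler that throws, once with a handler that returns a rejected promise, and once with a handler that succeeds, then assert after each call that the request is no longer marked. The same three cases are worth checking against every step of the real loop.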