How to skip URL with ECONNRESET or ETIMEDOUT errors
harrysayers opened this issue · comments
I'm crawling hundreds of thousands of XML/RSS feeds that are streamed from a local file using fs.createReadStream. After about 16,000 links, the same links stop Node with ECONNRESET or ETIMEDOUT errors. What I want is to skip a URL and continue with the next links if it has timed out, can't establish a secure connection, etc. How would I do this?
Basic settings for the crawler. It is called from another file and passed the link via "show.FeedLink":
async function crawlLink(show) {
    const crawler = new Crawler({
        maxConnections: 8,
        timeout: 15000,
        retries: 1,
        retryTimeout: 10000,
        callback: async function (error, res, done) {
            if (error) {
                console.log(error);
                // As posted: `reject` is not defined in this scope,
                // and done() is called again below.
                reject(done())
            } else {
                // ...... Logic saving link to DB......
            }
            done();
        }
    });
    crawler.queue({uri: show.FeedLink});
}
Sorry, I'm not familiar with async functions, so anyone who knows them may be able to help.
Did you find a way to do this? This error shouldn't stop the crawler, especially when you're crawling thousands of sites.
I suggest not using Promise or async here.
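A minimal sketch of that suggestion, in case it helps: a plain (non-async) callback that logs the failed URL, skips it, and calls done() exactly once so the queue keeps moving. makeSkippingCallback, saveLink, saveLinkToDb and the skipped array are hypothetical names for illustration, not part of the crawler package, and reading the URL from res.options.uri on error is an assumption about what the crawler passes back.

```javascript
// Hypothetical helper (not part of node-crawler): builds a plain, non-async
// callback that records failures and always calls done() exactly once.
function makeSkippingCallback(saveLink, skipped) {
    return function (error, res, done) {
        // Assumption: the crawler passes the queued options as res.options.
        const uri = res && res.options ? res.options.uri : undefined;
        try {
            if (error) {
                // Network failures such as ECONNRESET or ETIMEDOUT carry error.code.
                skipped.push({ uri: uri, reason: error.code || error.message });
            } else {
                saveLink(res); // placeholder for the "save link to DB" logic
            }
        } catch (err) {
            // A bug in saveLink should not stop the crawl either.
            skipped.push({ uri: uri, reason: err.message });
        } finally {
            done(); // release the connection slot so the next URL can start
        }
    };
}

// Usage with the settings from the question (requires the crawler package):
// const Crawler = require('crawler');
// const skipped = [];
// const crawler = new Crawler({
//     maxConnections: 8,
//     timeout: 15000,
//     retries: 1,
//     retryTimeout: 10000,
//     callback: makeSkippingCallback(saveLinkToDb, skipped) // saveLinkToDb is hypothetical
// });
// crawler.queue({ uri: show.FeedLink });
```

If the save step returns a promise, the try/catch above will not see its rejection, so attach a .catch() to that promise inside saveLink rather than making the callback async.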