flesler / node-spider

Generic web crawler powered by NodeJS

Example doesn't work

joshuaos opened this issue

It seems the example needs some tweaking on the var href line.

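This doesn't work, because inside cheerio's .each() callback elem is a raw DOM element, not a cheerio object, so it has no .attr() method: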

var handleRequest = function(doc) {
    // new page crawled 
    console.log(doc.res); // response object 
    console.log(doc.url); // page url 
    // uses cheerio, check its docs for more info 
    doc.$('a').each(function(i, elem) {
        // do stuff with element 
        var href = elem.attr('href').split('#')[0];
        var url = doc.resolve(href);
        // crawl more 
        spider.queue(url, handleRequest);
    });
};

This does, because doc.$(elem) wraps the element in a cheerio object first:

var handleRequest = function(doc) {
    // new page crawled 
    console.log(doc.res); // response object 
    console.log(doc.url); // page url 
    // uses cheerio, check its docs for more info 
    doc.$('a').each(function(i, elem) {
        // do stuff with element 
        var href = doc.$(elem).attr('href').split('#')[0];
        var url = doc.resolve(href);
        // crawl more 
        spider.queue(url, handleRequest);
    });
};

You are right, thank you.
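
One more thing that may be worth guarding against: in the working version, doc.$(elem).attr('href') is undefined for anchors that have no href attribute, so the .split('#') call would raise a TypeError on such a page. Below is a minimal sketch of a guard, assuming you want to skip those anchors. stripFragment is a hypothetical helper name, not part of node-spider or cheerio.

```javascript
// Hypothetical helper (not part of node-spider or cheerio): returns the href
// without its "#fragment", or null when there is nothing worth crawling.
function stripFragment(href) {
    if (typeof href !== 'string') {
        // e.g. <a name="top"> has no href, so .attr('href') gives undefined
        return null;
    }
    var clean = href.split('#')[0];
    // A bare "#section" link points back at the current page
    return clean === '' ? null : clean;
}

// Inside the .each() callback from the working example, the body could become:
//     var href = stripFragment(doc.$(elem).attr('href'));
//     if (href === null) return; // skip this anchor, the loop continues
//     spider.queue(doc.resolve(href), handleRequest);
```

Returning nothing from the callback lets .each() move on to the next element, so anchors without a usable href are simply skipped.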