lgraubner / sitemap-generator

Easily create XML sitemaps for your website.

Failed to complete "unzip"

florianginetta opened this issue

Do you want to request a feature or report a bug?
Bug

What is the current behavior?
I ran the crawler on a larger site, and just as all pages had been added the program crashed. It looks like an issue with "unzip".

{ added: 3707, ignored: 127, errored: 20 }
zlib.js:100
    buf = (bufs.length === 1 ? bufs[0] : Buffer.concat(bufs, this.nread));
                ^

TypeError: Cannot read property 'length' of null
    at Unzip.zlibBufferOnEnd (zlib.js:100:17)
    at Unzip.emit (events.js:164:20)
    at endReadableNT (_stream_readable.js:1054:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
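
The trace points at Node's own zlib code rather than sitemap-generator itself; simplecrawler decompresses gzip/deflate responses, which is where the Unzip stream comes in. A rough, hypothetical sketch of that decompression path (placeholder hostname, not this crawler's actual internals):

const https = require('https');
const zlib = require('zlib');

// Hypothetical illustration only: fetch a gzip-encoded page and pass the
// collected body to zlib.unzip(), the convenience wrapper whose 'end'
// handler (zlibBufferOnEnd) appears in the stack trace above.
https.get({
  hostname: 'example.com',                      // placeholder host
  path: '/',
  headers: { 'accept-encoding': 'gzip' },
}, (res) => {
  const chunks = [];
  res.on('data', (chunk) => chunks.push(chunk));
  res.on('end', () => {
    zlib.unzip(Buffer.concat(chunks), (err, body) => {
      // A malformed or truncated compressed body makes the stream error;
      // if 'end' still fires afterwards, zlibBufferOnEnd sees nulled
      // buffers, which seems to match the crash above.
      if (err) return console.error('unzip failed:', err);
      console.log(body.toString().slice(0, 200));
    });
  });
});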


What is the expected behavior?
I tested the script on a small site and it worked. I don't know what the issue is in my case. Does it perhaps come from simplecrawler or lodash?

Code to reproduce
Apparently the program fails after adding 3707 pages…


const SitemapGenerator = require('sitemap-generator');

// create generator
const generator = SitemapGenerator([URL], {
  stripQuerystring: false
});

// register event listeners
generator.on('add', (url) => {
  console.log(url);
  console.log(generator.getStats());
});

generator.on('done', () => {
  // sitemaps created
});

// start the crawler
generator.start();
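
To narrow down which of the errored URLs fail, it might also help to register an error listener; a sketch, assuming the error payload exposes code, message and url as shown in the README:

// Sketch: log crawl errors to see which URLs fail before the crash.
generator.on('error', (error) => {
  console.error(error);
  // e.g. { code: 404, message: 'Not found.', url: 'http://example.com/foo' }
});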

Is some kind of gzip compression used? This is probably caused by simplecrawler, but I would have to test it. I doubt the number of pages matters; 3707 is not that many anyway.
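
A quick way to check whether the server actually answers with gzip would be to request a page with compression allowed and look at the content-encoding header (placeholder hostname below, not the crawled site):

const https = require('https');

// Sketch: ask for a compressed response and print the encoding the
// server actually uses.
https.get({
  hostname: 'example.com',                      // placeholder
  path: '/',
  headers: { 'accept-encoding': 'gzip, deflate' },
}, (res) => {
  console.log('content-encoding:', res.headers['content-encoding']);
  res.resume();                                 // discard the body
});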