lgraubner / sitemap-generator

Easily create XML sitemaps for your website.

Only one URL has been discovered

jean-christophe-manciot opened this issue

Do you want to request a feature or report a bug?
bug

$ npm install -S sitemap-generator
npm WARN saveError ENOENT: no such file or directory, open '/home/actionmystique/.config/sitemap-generator/package.json'
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN enoent ENOENT: no such file or directory, open '/home/actionmystique/.config/sitemap-generator/package.json'
npm WARN sitemap-generator No description
npm WARN sitemap-generator No repository field.
npm WARN sitemap-generator No README data
npm WARN sitemap-generator No license field.

+ sitemap-generator@8.4.2
added 39 packages from 64 contributors and audited 58 packages in 4.133s
found 0 vulnerabilities

sitemap-generator.js:

const SitemapGenerator = require('sitemap-generator');

// create generator
const generator = SitemapGenerator('https://git.sdxlive.com', {
  filepath: './sitemap.xml',
  lastMod: true,
  maxDepth: 9999,
  maxEntriesPerFile: 50000,
  stripQuerystring: true
});

// register event listeners
generator.on('done', () => {
  // sitemaps created
});

// start the crawler
generator.start();

$ node sitemap-generator.js

leads to sitemap.xml:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://git.sdxlive.com/</loc>
    <lastmod>2020-03-07</lastmod>
  </url>
</urlset>

@lgraubner What am I missing?

Same issue with sitemap-generator-cli:

$ sudo npm install -g sitemap-generator-cli
/usr/local/bin/sitemap-generator -> /usr/local/lib/node_modules/sitemap-generator-cli/index.js
+ sitemap-generator-cli@7.5.0
added 47 packages from 67 contributors in 2.363s
$ sitemap-generator --last-mod https://git.sdxlive.com

sitemap.xml:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://git.sdxlive.com/</loc>
    <lastmod>2020-03-07</lastmod>
  </url>
</urlset>

Facing the same issue. Did anyone find a workaround?

I also encountered the same problem when I ran it against my local VuePress documentation website.

We are experiencing this same problem. Any update on this?

Same here. It's simply not working: it generates a sitemap with the links on the initial URL only. No deeper crawling.

For anyone else who comes across this: if only your root webpage is included in the sitemap, it usually means that your website's pages are being generated client-side by a JavaScript framework such as React or Vue. Since the sitemap crawler doesn't execute JavaScript, it will just see a mostly blank page. You can confirm this by running curl YOUR_DOMAIN from your terminal. If your page's <body> is mostly empty and doesn't contain your actual webpage HTML, then you have this problem.
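
To make the curl check above a bit more concrete, here is a minimal sketch that counts the links visible in raw HTML, which is roughly what a crawler that doesn't run JavaScript has to work with. The function name countStaticLinks and the two sample pages are made up for illustration, and the regular expression is only a rough heuristic, not a real HTML parser.

```javascript
// Rough heuristic: count <a href=...> tags present in the raw HTML,
// i.e. the links a crawler can see without executing any JavaScript.
// countStaticLinks is a hypothetical helper, not part of sitemap-generator.
function countStaticLinks(html) {
  const matches = html.match(/<a\b[^>]*\bhref\s*=/gi);
  return matches ? matches.length : 0;
}

// Typical client-side-rendered shell: an empty mount point plus a script tag.
const spaShell =
  '<html><body><div id="app"></div><script src="/app.js"></script></body></html>';

// Server-rendered page: the links are already in the markup.
const renderedPage =
  '<html><body><a href="/docs">Docs</a> <a href="/blog">Blog</a></body></html>';

console.log(countStaticLinks(spaShell));     // 0
console.log(countStaticLinks(renderedPage)); // 2
```

If the HTML returned by curl for your own site gives a count of 0 (or close to it), the crawler has nothing to follow beyond the root URL.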

A couple of solutions:

  1. Use server-side rendering with your frontend framework (like Next.js for React or Nuxt.js for Vue) to generate complete HTML pages on the server.

  2. Use a prerendering service like prerender.io or ostr.io to prerender your pages for search engine crawlers. You can then build the sitemap by telling sitemap-generator to pretend it's Googlebot, so that your site returns the full prerendered HTML page to sitemap-generator. Using the CLI version:

sitemap-generator --verbose --max-concurrency 2 --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" YOUR_DOMAIN
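
For those using the library rather than the CLI (as in the original report), something along these lines should be roughly equivalent. This is a sketch, not a confirmed recipe: it assumes the userAgent and maxConcurrency options are accepted by sitemap-generator and passed through to its underlying crawler, and that the 'add' and 'error' events are available alongside 'done'. buildSitemap is a hypothetical helper name, so check the README of your installed version before relying on it.

```javascript
// Options mirroring the CLI flags above. userAgent and maxConcurrency are
// assumed to be supported by sitemap-generator / its underlying crawler.
const options = {
  filepath: './sitemap.xml',
  maxConcurrency: 2,
  userAgent: 'Googlebot/2.1 (+http://www.google.com/bot.html)',
};

// Hypothetical helper; requires `npm install sitemap-generator`.
function buildSitemap(url) {
  const SitemapGenerator = require('sitemap-generator');
  const generator = SitemapGenerator(url, options);

  // Log progress so it is obvious whether more than the root URL is found.
  generator.on('add', (addedUrl) => console.log('added', addedUrl));
  generator.on('error', (err) => console.error('error', err));
  generator.on('done', () => console.log('sitemap written to', options.filepath));

  generator.start();
  return generator;
}

// Usage: buildSitemap('https://YOUR_DOMAIN');
```

Logging the 'add' events makes it easy to see whether the prerendered pages are actually being served to the crawler.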