Sitemap.xml is not being parsed correctly

Question

mkantautas opened this issue 6 years ago · comments

Mantas Kantautas commented 6 years ago

For example, with https://taxibambino.com/sitemap.xml only 2 pages are being parsed.

Lars Graubner · Answer 1 · Tue May 01 2018 23:56:20 GMT+0800 (China Standard Time)

This tool does not parse the sitemap.xml file, it creates one.

Mantas Kantautas · Answer 2 · Wed May 02 2018 13:36:41 GMT+0800 (China Standard Time)

I just assumed it did, because simplecrawler parses sitemap directives by default and, as I understand it, simplecrawler is the core of this package. Anyway, the main issue seems to be with the site itself, which gives inconsistent results: one day sitemap generator works (indexing all pages), the next day it only catches the main page and the main page's sitemap.xml (because there is a link to it in robots.txt). Otherwise it doesn't find sitemap.xml.

Lars Graubner · Answer 3 · Wed May 02 2018 14:00:53 GMT+0800 (China Standard Time)

Actually, you are right. If the sitemap is linked in robots.txt, it should be parsed.
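
For anyone following along, here is a minimal sketch of what "parsing a sitemap linked in robots.txt" roughly involves: find the `Sitemap:` lines in robots.txt, then pull the `<loc>` entries out of the sitemap they point to. The function names `extractSitemapUrls` and `extractPageUrls` are hypothetical and written only for illustration. This is not the code used by simplecrawler or by this package, and it assumes the robots.txt and sitemap bodies have already been fetched as strings.

```javascript
// Hypothetical helpers for illustration only; not part of
// sitemap-generator or simplecrawler.

// Collect the URLs named by "Sitemap:" lines in a robots.txt body.
// The directive name is matched case-insensitively.
function extractSitemapUrls(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => /^sitemap\s*:\s*(\S+)/i.exec(line.trim()))
    .filter((match) => match !== null)
    .map((match) => match[1]);
}

// Collect the <loc> values from a sitemap XML body.
// Assumption: a regex scan is enough for a sketch; it does not decode
// XML entities (such as &amp;) and does not handle CDATA sections.
function extractPageUrls(sitemapXml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/gi;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Example input using a placeholder domain.
const exampleRobots = [
  'User-agent: *',
  'Disallow: /admin',
  'Sitemap: https://example.com/sitemap.xml',
].join('\n');

const exampleSitemap =
  '<urlset>' +
  '<url><loc>https://example.com/</loc></url>' +
  '<url><loc>https://example.com/about</loc></url>' +
  '</urlset>';

console.log(extractSitemapUrls(exampleRobots));
console.log(extractPageUrls(exampleSitemap));
```

If the `Sitemap:` line is missing from robots.txt, the first helper returns an empty array. That matches the behaviour described above, where sitemap.xml is only found when robots.txt links to it.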