stefandoorn / sitemap-plugin

Sitemap Plugin for Sylius eCommerce platform

Generate sitemaps from CLI

stefandoorn opened this issue · comments

Especially for larger sitemaps, it makes more sense to generate the sitemap once and store it in the public folder than to generate it on every request. Real-time generation could remain available as a fallback, possibly as something the user can configure.

Proposal:

  • CLI command to generate the sitemap index and store it at /sitemap_index.xml
  • CLI command to generate a specific sitemap and store it at /sitemap/{provider}.xml
  • CLI command to generate all the above at once
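
To make the proposed file layout concrete, here is a rough, language-neutral sketch in Python (not the plugin's PHP, and not its actual API). The function names, the `public_dir` argument and the shape of the `providers` mapping are all assumptions made for illustration; only the two target paths come from the proposal above.

```python
from pathlib import Path
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def render(root_tag: str, child_tag: str, locations: list[str]) -> bytes:
    """Render a <urlset> or <sitemapindex> document with one <loc> per entry."""
    root = ET.Element(root_tag, xmlns=SITEMAP_NS)
    for location in locations:
        ET.SubElement(ET.SubElement(root, child_tag), "loc").text = location
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def generate_sitemap(public_dir: Path, provider: str, urls: list[str]) -> Path:
    """Hypothetical 'generate one sitemap' command: writes sitemap/{provider}.xml."""
    target = public_dir / "sitemap" / f"{provider}.xml"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(render("urlset", "url", urls))
    return target


def generate_index(public_dir: Path, base_url: str, providers: list[str]) -> Path:
    """Hypothetical 'generate index' command: writes sitemap_index.xml."""
    target = public_dir / "sitemap_index.xml"
    target.parent.mkdir(parents=True, exist_ok=True)
    locations = [f"{base_url}/sitemap/{name}.xml" for name in providers]
    target.write_bytes(render("sitemapindex", "sitemap", locations))
    return target


def generate_all(public_dir: Path, base_url: str,
                 providers: dict[str, list[str]]) -> list[Path]:
    """Hypothetical 'generate everything' command: sitemaps first, index last."""
    written = [generate_sitemap(public_dir, name, urls)
               for name, urls in providers.items()]
    written.append(generate_index(public_dir, base_url, list(providers)))
    return written
```

Calling `generate_all(Path("public"), "https://example.com", {"products": [...]})` would leave static files that the web server can serve directly, which is the point of the proposal.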

Optional:

  • Allow default routing to still work with on the fly generation, which can be disabled/enabled via a setting
  • If used, call the generator the same way the CLI would

Investigate / discuss:

  • Storing files in the public root might require too many permissions. It might be better to store the index at sitemap/index.xml and let the default route sitemap_index.xml perform a redirect.
  • By default the CLI isn't aware of the request context, so https://symfony.com/doc/current/console/request_context.html should be applied. Find out whether that might be difficult for certain users. It might need to be overridable for this specific plugin.

> Allow default routing to still work with on the fly generation, which can be disabled/enabled via a setting

I think this should be disabled by default, basically because it's an easy way to take down your server or at least slow it down. I also believe that configuration should default to best-practice values, and I would say that creating feeds on the fly is not best practice. But that's my opinion :-)

Regarding the request context
We have the hostname on the channel resource so I think this should be used.

Another thing is that multiple hosts can be used on the same application, which means we need some kind of naming pattern for the static files.

A possible solution could be to handle the sitemap index with a controller (handling requests to /sitemap_index.xml) that outputs XML based on the current request context. The static files could then be put in subdirectories, e.g. /sitemap/{channel_id}/products_1.xml etc.
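
A small sketch of that naming pattern, written in Python purely for illustration (the plugin itself is PHP). The helper names and the `channels` mapping from hostname to channel id are hypothetical; only the `/sitemap/{channel_id}/products_1.xml` shape and the idea of using the channel's hostname come from the comments above.

```python
def sitemap_path(channel_id: str, provider: str, page: int) -> str:
    """Path of one static sitemap file, following /sitemap/{channel_id}/{provider}_{page}.xml."""
    return f"/sitemap/{channel_id}/{provider}_{page}.xml"


def sitemap_url(hostname: str, channel_id: str, provider: str, page: int,
                scheme: str = "https") -> str:
    """Absolute URL built from the channel's hostname instead of a request context."""
    return f"{scheme}://{hostname}{sitemap_path(channel_id, provider, page)}"


def channel_for_host(request_host: str, channels: dict[str, str]) -> str:
    """What an index controller might do: pick the channel matching the request host."""
    try:
        return channels[request_host.lower()]
    except KeyError:
        raise LookupError(f"no channel configured for host {request_host!r}") from None
```

For example, `sitemap_path("web_us", "products", 1)` returns `/sitemap/web_us/products_1.xml`, so two channels on different hosts never write to the same file.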

Valid comments!

For now, the sitemap index is generated into a separate folder that includes the channel prefix; on request, the controller just fetches the file and outputs it. This should obviously be changed to a stream later, but for now it works in the master branch and can be built upon.
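
To illustrate the difference between fetching the whole file and streaming it, here is a minimal Python sketch (again not the plugin's code; `stream_file` and the chunk size are made up for the example). Instead of loading the file into memory, it yields fixed-size chunks that a response object could send one by one. In Symfony the likely counterparts would be `BinaryFileResponse` or `StreamedResponse`, but which one the plugin ends up using is my assumption.

```python
from pathlib import Path
from typing import Iterator


def stream_file(path: Path, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the file in chunks so memory use stays bounded by chunk_size."""
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            yield chunk
```

A controller would then iterate over `stream_file(...)` and write each chunk to the output, rather than reading the full sitemap into a string first.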

Also, we probably need to integrate both index and sitemap generation in a single command: as soon as we implement support for split sitemaps (#79), the index can only be generated after the sitemaps have been generated. Only then can the index know how many files, and therefore URLs, were generated. We could scan folders etc., but I think the sitemap providers should report to the index which files they just generated, and the index should be built from those reports.
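
The reporting idea could look roughly like the following Python sketch. It is an illustration under assumptions, not the plugin's design: `GeneratedFile`, `split_and_report` and `build_index` are hypothetical names, and the file naming is invented. The default limit of 50,000 URLs per file is the limit set by the sitemaps.org protocol. The sketch only computes the reports and the index; it does not write the sitemap files themselves.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class GeneratedFile:
    path: str        # path relative to the public root (hypothetical naming)
    url_count: int   # number of URLs the provider put into this file


def split_and_report(provider: str, urls: list[str],
                     limit: int = 50000) -> list[GeneratedFile]:
    """Split a provider's URLs into pages and report one entry per page."""
    reports = []
    for page, start in enumerate(range(0, len(urls), limit), start=1):
        chunk = urls[start:start + limit]
        reports.append(GeneratedFile(f"sitemap/{provider}_{page}.xml", len(chunk)))
    return reports


def build_index(base_url: str, reports: list[GeneratedFile]) -> str:
    """Build the index purely from what providers reported, without scanning folders."""
    entries = "".join(
        f"  <sitemap><loc>{escape(f'{base_url}/{report.path}')}</loc></sitemap>\n"
        for report in reports
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</sitemapindex>\n"
    )
```

Because the index is derived from the reports, running both steps in one command guarantees that it lists exactly the files that were just generated.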