A Sphinx extension to generate multiversion and multilanguage sitemaps.org compliant sitemaps for the HTML version of your Sphinx documentation.
Install directly with pip:
pip install sphinx-sitemap
Add sphinx_sitemap to the extensions array in your Sphinx conf.py. For example:
extensions = ['sphinx_sitemap']
Set the value of html_baseurl in your Sphinx conf.py to the current base URL of your documentation. For example:
html_baseurl = 'https://my-site.com/docs/'
After the HTML build is done, sphinx-sitemap will output the location of the sitemap:
sitemap.xml was generated for URL https://my-site.com/docs/ in /path/to/_build/sitemap.xml
Note: Make sure to confirm the accuracy of the sitemap after installs and upgrades.
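One way to confirm the sitemap after a build is to parse it and check that every listed URL sits under your html_baseurl. The following is a minimal sketch using only the Python standard library. The helper names sitemap_locations and urls_outside_base are hypothetical and are not part of sphinx-sitemap.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemaps.org <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_data):
    """Return the <loc> URLs found in a sitemap document (str or bytes)."""
    root = ET.fromstring(xml_data)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def urls_outside_base(xml_data, base_url):
    """Return the sitemap URLs that do not start with the expected base URL."""
    return [url for url in sitemap_locations(xml_data) if not url.startswith(base_url)]
```

For a real build you could pass in the bytes of the generated file, for example pathlib.Path("/path/to/_build/sitemap.xml").read_bytes(), and expect urls_outside_base to return an empty list.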
For multiversion sitemaps, you have to generate a sitemap per version and then manually add their locations to a sitemapindex file.
The extension will look at the version config value for the current version being built, so make sure that is set.
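Since the sitemapindex file is assembled by hand, a small script can help keep it in sync with the per-version sitemaps. The following is a sketch under stated assumptions: build_sitemapindex is a hypothetical helper, not something the extension provides, and the per-version sitemap URLs depend entirely on how you deploy your docs.

```python
import xml.etree.ElementTree as ET


def build_sitemapindex(sitemap_urls):
    """Return a sitemaps.org sitemapindex document (bytes) listing the given sitemaps."""
    root = ET.Element(
        "sitemapindex", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Assumed deployment layout: one sitemap.xml per published version.
index_xml = build_sitemapindex(
    [
        "https://my-site.com/docs/1.0/sitemap.xml",
        "https://my-site.com/docs/latest/sitemap.xml",
    ]
)
```

The resulting bytes can be written to a file such as sitemapindex.xml and served next to your documentation.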
Note: When using multiple versions, it is best practice to set the canonical URL in the theme layout of all versions to the latest version of that page:
<link rel="canonical" href="https://my-site.com/docs/latest/index.html"/>
For multilingual sitemaps, you have to generate a sitemap per language/locale and then manually add their locations to a sitemapindex file.
The primary language is taken from the language config value. Alternative languages are either set manually with the sitemap_locales option or auto-detected by the extension from the locale_dirs config value, so make sure one of those is set.
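As an illustration, a conf.py that relies on auto-detection might contain something like the fragment below. The directory name locale/ is an assumption, so use whatever path holds your translation catalogs.

```python
# conf.py (fragment)
language = 'en'            # primary language of this build
locale_dirs = ['locale/']  # assumed location of the translation catalogs
```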
The sitemap_locales option is handy when you want to list only some of the existing locales in the sitemap, when a third-party extension adds locale_dirs to Sphinx for languages that you don't support in your docs, or when you want to "exclude" the primary language (language). For example, if the primary language is en, the sitemap will contain it twice:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://my-site.com/docs/en/index.html</loc>
<xhtml:link href="https://my-site.com/docs/es/index.html" hreflang="es" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/fr/index.html" hreflang="fr" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/en/index.html" hreflang="en" rel="alternate"/>
</url>
<url>
<loc>https://my-site.com/docs/en/about.html</loc>
<xhtml:link href="https://my-site.com/docs/es/about.html" hreflang="es" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/fr/about.html" hreflang="fr" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/en/about.html" hreflang="en" rel="alternate"/>
</url>
</urlset>
If you limit the sitemap to specific locales:
sitemap_locales = ['es', 'fr']
The end result is something like the following for each language/version build:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://my-site.com/docs/en/index.html</loc>
<xhtml:link href="https://my-site.com/docs/es/index.html" hreflang="es" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/fr/index.html" hreflang="fr" rel="alternate"/>
</url>
<url>
<loc>https://my-site.com/docs/en/about.html</loc>
<xhtml:link href="https://my-site.com/docs/es/about.html" hreflang="es" rel="alternate"/>
<xhtml:link href="https://my-site.com/docs/fr/about.html" hreflang="fr" rel="alternate"/>
</url>
</urlset>
If you set the special value [None]:
sitemap_locales = [None]
only the primary language is generated:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://my-site.com/docs/en/index.html</loc>
</url>
<url>
<loc>https://my-site.com/docs/en/about.html</loc>
</url>
</urlset>
If you have both language and version set, the default URL format is {version}{lang}{link}. To change the default behavior, set the value of sitemap_url_scheme in your Sphinx conf.py to the desired format. For example:
sitemap_url_scheme = "{lang}{version}subdir/{link}"
Note: The extension is currently opinionated, in that it automatically appends trailing slashes to both the language and version values. You can also omit values from the scheme to get the desired behavior.
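To see how a scheme expands into a URL, the behavior described above can be approximated with plain Python string formatting. This is an illustrative sketch, not the extension's actual code. render_url is a hypothetical helper, and it assumes the trailing slash is only added when a value is set.

```python
def render_url(base_url, scheme, lang, version, link):
    """Approximate how a sitemap_url_scheme turns into a full URL."""
    # Mirror the documented behavior: language and version get a trailing slash.
    lang_part = f"{lang}/" if lang else ""
    version_part = f"{version}/" if version else ""
    # Placeholders missing from the scheme are simply ignored by str.format.
    return base_url + scheme.format(lang=lang_part, version=version_part, link=link)


url = render_url(
    "https://my-site.com/docs/",
    "{lang}{version}subdir/{link}",
    lang="en",
    version="1.0",
    link="index.html",
)
print(url)  # https://my-site.com/docs/en/1.0/subdir/index.html
```

Omitting a placeholder, for example using the scheme "{link}", drops that component from the URL entirely.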
Add a robots.txt file in the source directory which contains a link to the sitemap or sitemapindex. For example:
User-agent: *
Sitemap: https://my-site.com/docs/sitemap.xml
Then, add robots.txt to the html_extra_path config value:
html_extra_path = ['robots.txt']
Submit the sitemap or sitemapindex to the appropriate search engine tools.
You can use GitHub search or libraries.io to see who is using sphinx-sitemap.
Pull Requests welcome! See CONTRIBUTING for instructions on how best to contribute.
sphinx-sitemap is made available under an MIT license; see LICENSE for details.
Originally based on the sitemap generator in the guzzle_sphinx_theme project, also licensed under the MIT license.