django-seo-js is a drop-in app that provides full SEO support for angular, backbone, ember, famo.us, and other SPA apps built with django.
It's simple to set up, configurable to use multiple services, and easy to customize.
- Pip install:

    pip install django-seo-js

- Add to your settings.py:

    # If in doubt, just include both.  Details below.
    MIDDLEWARE_CLASSES = (
        'django_seo_js.middleware.EscapedFragmentMiddleware',  # If you're using #!
        'django_seo_js.middleware.UserAgentMiddleware',  # If you want to detect by user agent
    ) + MIDDLEWARE_CLASSES

    INSTALLED_APPS += ('django_seo_js',)

    # If you're using prerender.io (the default backend):
    SEO_JS_PRERENDER_TOKEN = "123456789abcdefghijkl"  # Really, put this in your env, not your codebase.

- Add to your base.html:

    {% load django_seo_js %}
    <head>
        {% seo_js_head %}
        ...
    </head>

- That's it. :) Your js-heavy pages are now rendered properly to the search engines. Have a lovely day.
Want more advanced control? Keep reading.
For the most part, you shouldn't need to override these - we've aimed for sensible defaults.
# Backend to use
SEO_JS_BACKEND = "django_seo_js.backends.PrerenderIO" # Default
# Whether to run the middlewares and update_cache_for_url. Useful to set False for unit testing.
SEO_JS_ENABLED = True # Defaults to *not* DEBUG.
# User-agents to render for, if you're using the UserAgentMiddleware
# Defaults to the most popular. If you have custom needs, pull from the full list:
# http://www.useragentstring.com/pages/Crawlerlist/
SEO_JS_USER_AGENTS = [
    "Googlebot",
    "Yahoo",
    "bingbot",
    "Baidu",
    "Ask Jeeves",
]
# Urls to skip the rendering backend, and always render in-app.
# Defaults to excluding sitemap.xml.
SEO_JS_IGNORE_URLS = [
    "/sitemap.xml",
]
SEO_JS_IGNORE_EXTENSIONS = [
    ".xml",
    ".txt",
    # See helpers.py for full list of extensions ignored by default.
]
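To make the two ignore settings concrete, here is a rough, self-contained sketch of the kind of check they describe. It is an illustration only, not django-seo-js's actual implementation: the function name should_skip_prerender is hypothetical, and whether the library matches ignored URLs exactly or by substring is an assumption (the sketch uses substring matching).

```python
def should_skip_prerender(path, ignore_urls, ignore_extensions):
    """Return True if `path` should be rendered in-app, not by the backend.

    Hypothetical helper for illustration; not part of django-seo-js.
    """
    # Assumption: a URL is ignored if any configured entry appears in the path.
    if any(ignored in path for ignored in ignore_urls):
        return True
    # Compare extensions case-insensitively, ignoring any query string.
    path_only = path.split("?", 1)[0].lower()
    return any(path_only.endswith(ext.lower()) for ext in ignore_extensions)


ignore_urls = ["/sitemap.xml"]
ignore_extensions = [".xml", ".txt"]

for candidate in ("/sitemap.xml", "/robots.txt", "/listings/42"):
    print(candidate, should_skip_prerender(candidate, ignore_urls, ignore_extensions))
```

In a real project the middlewares perform this check for you; the sketch only shows why a sitemap or robots.txt request never needs to reach the rendering service.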
django-seo-js defaults to using prerender.io because it's both open-source if you want to run it yourself, and really reasonably priced if you don't.
To use prerender.io, add your token (and, optionally, a timeout) to your settings:
# Prerender.io token
SEO_JS_PRERENDER_TOKEN = "123456789abcdefghijkl"
# Optional timeout, in the format python-requests accepts, e.g. a (connect, read) tuple in seconds
SEO_JS_PRERENDER_TIMEOUT = (5, 5)
You don't need to set SEO_JS_BACKEND, since it defaults to "django_seo_js.backends.PrerenderIO".
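Since the token is better kept in your environment than in your codebase, one way to wire that up in settings.py might look like the sketch below. The environment variable name PRERENDER_TOKEN is an assumption; use whatever name your deployment exports.

```python
import os

# Assumption: the token is exported as PRERENDER_TOKEN in the environment.
# The empty-string default keeps settings importable on machines
# (a laptop, a CI runner) where the variable is not set.
SEO_JS_PRERENDER_TOKEN = os.environ.get("PRERENDER_TOKEN", "")
```

If you would rather fail loudly when the token is missing, os.environ["PRERENDER_TOKEN"] raises a KeyError instead of falling back to a default.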
If you're hosting your own instance of prerender (there are Docker images, for those inclined), configuration is similar:
SEO_JS_BACKEND = "django_seo_js.backends.PrerenderHosted"
SEO_JS_PRERENDER_URL = "http://my-prerenderapp.com/" # Note trailing slash.
SEO_JS_PRERENDER_RECACHE_URL = "http://my-prerenderapp.com/recache"
If you write a custom backend for a public service, please consider submitting it as a PR, so everyone can benefit!
Backends must implement the following methods:
from django_seo_js.backends import SEOBackendBase


class MyBackend(SEOBackendBase):

    def get_response_for_url(self, url):
        """
        Accepts a fully-qualified url.
        Returns an HttpResponse, passing through all headers and the status code.
        """
        raise NotImplementedError

    def update_url(self, url):
        """
        Force an update of the cache for a particular URL.
        Returns True on success, False on fail.
        """
        raise NotImplementedError
If you're hitting an http endpoint, there's also the helpful RequestsBasedBackend, which has a build_django_response_from_requests_response method that transforms a python-requests response to a django HttpResponse, including headers, status codes, etc.
If you know a page's contents have changed, some backends allow you to manually update the page cache. django-seo-js provides helpers to make that easy.
from django_seo_js.helpers import update_cache_for_url
update_cache_for_url("/my-url")
So, for instance, you might want something like:
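from django.db.models.signals import post_save
from django.urls import reverse  # django.core.urlresolvers on Django < 1.10

def listing_changed(sender, instance, created, **kwargs):
    # reverse() already returns a leading slash, so the host has no trailing one.
    update_cache_for_url("%s%s" % ("http://example.com", reverse("listing_detail", args=[instance.pk])))

post_save.connect(listing_changed, sender=Listing)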
If you need to customize the fully-qualified URL, you can subclass any backend and override the build_absolute_uri() method.
class MyBackend(SEOBackendBase):

    def build_absolute_uri(self, request):
        """Strip out all query params:"""
        # The scheme, host, and path come from the incoming request, not the backend.
        return '{scheme}://{host}{path}'.format(
            scheme=request.scheme,
            host=request.get_host(),
            path=request.path,
        )
If you're looking for a big-picture explanation of how SEO for JS-heavy apps is handled, the clearest explanation I've seen is this StackOverflow answer.
If even that's TL;DR for you, here's a bullet-point summary:
- If requests come in with an _escaped_fragment_ querystring or a particular user agent, a pre-rendered HTML response is served, instead of your app.
- That pre-rendered HTML is generated by a service with a headless browser that runs your js then caches the rendered page.
- Said service is generally a third party (there are many: prerender.io, Brombone, seo.js, seo4ajax.) You can also run such a service yourself, using prerender, or re-invent your own wheel for fun.
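The first bullet can be sketched in a few lines of plain Python. This is a simplified illustration of the decision, not the code django-seo-js runs: the function name wants_prerendered_page is hypothetical, and the case-insensitive substring match on the user agent is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def wants_prerendered_page(url, user_agent, crawler_agents):
    """Decide whether a request looks like it came from a crawler.

    Hypothetical helper for illustration; not part of django-seo-js.
    """
    # Crawlers following the escaped-fragment scheme add this parameter,
    # often with an empty value, so blank values must be kept.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "_escaped_fragment_" in query:
        return True
    # Assumption: match known crawler names as case-insensitive substrings.
    # A missing User-agent header is treated as an empty string.
    agent = (user_agent or "").lower()
    return any(name.lower() in agent for name in crawler_agents)


crawlers = ["Ask Jeeves", "Baidu"]
print(wants_prerendered_page("http://example.com/?_escaped_fragment_=", None, crawlers))
print(wants_prerendered_page("http://example.com/", "Mozilla/5.0 (compatible; Ask Jeeves/Teoma)", crawlers))
print(wants_prerendered_page("http://example.com/", "Mozilla/5.0 Firefox/38.0", crawlers))
```

When the function returns True, the request would be handed to the rendering service; otherwise your app responds as usual.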
PRs with additional backends, bug-fixes, documentation and more are definitely welcome!
Here are some guidelines on new code:
- Incoming code should follow PEP8 (there's a test to help out on this.)
- If you add new core-level features, write some quick docs in the README. If you're not sure if they're needed, just ask!
- Add your name and attribution to the AUTHORS file.
- Know you have everyone's thanks for helping to make django-seo-js even better!
Anyone is welcome to contribute to django-seo-js, regardless of skill level or experience. To make django-seo-js the best it can be, we have one big, overriding cultural principle:
Be kind.
Simple. Easy, right?
We've all been newbie coders, we've all had bad days, we've all been frustrated with libraries, we've all spoken a language we learned later in life. In discussions with other coders, PRs, and CRs, we just give each other the benefit of the doubt, listen well, and assume best intentions. It's worked out fantastically.
This doesn't mean we don't have honest, spirited discussions about the direction to move django-seo-js forward, or how to implement a feature. We do. We just respect one another while we do it. Not so bad, right? :)
django-seo-js was originally written and is maintained by Steven Skoczen. Since then, it's been improved by lots of people, including (alphabetically):
- alex-mcleod brought you the idea of ignoring certain urls via SEO_JS_IGNORE_URLS.
- andrewebdev improved the user-agent list to be more comprehensive.
- chazcb added the build_absolute_uri method, for subclassing in complex, generated setups.
- denisvlr fixed the update_url method.
- mattrobenolt made things faster, better, and stronger.
- rchrd2 fixed a breaking bug with the user agent middleware.
- thoop gave you SEO_JS_IGNORE_EXTENSIONS, allowing you to ignore by extension.
Original development was at GreenKahuna (now defunct.)
- Deprecation: django_seo_js.middleware.HashBangMiddleware is now called django_seo_js.middleware.EscapedFragmentMiddleware, to fix confusion. HashBangMiddleware will be removed in 0.5, which I would bet is probably late 2015 or early 2016. You'll see a log warning from now on. Thanks to thoop for the report.
- Bugfix to user agent middleware not respecting ENABLED, thanks to rchrd2. Also reported by denisvlr.
- New (backwards-compatible) build_absolute_uri method that can be overridden, thanks to chazcb.
- Removed Google, Yahoo, and Bing from the default USER_AGENTS, since they now support the escaped fragment protocol (and leaving them in can cause a cloaking penalty). Thanks to thoop for pointing this out.
- Fixes to the update_url method, thanks to denisvlr.
- Optimizations in lookups, thanks to mattrobenolt.
- Changes behavior to more sanely not follow redirects, per #9, thanks to denisvlr and mattrobenolt.
- Adds a few more user agents to the defaults, per #7, and the suggestion of andrewebdev.
- Adds an optional SEO_JS_IGNORE_EXTENSIONS setting that contains a list of extensions to ignore, thanks to the suggestion by thoop.
- Adds an optional SEO_JS_IGNORE_URLS setting that contains a list of urls to ignore, thanks to the sitemap.xml prerender bug reported by alex-mcleod.
- Backwards incompatible changes to SEOBackendBase: all backends are now expected to return an HttpResponse for their get_response_for_url methods. If you have custom backends, they'll need to be updated. All included backends have been updated, so if you're using an included backend, you can just pip install the new version, and go.
- Returns pages that come back from the cache with anything besides a 5xx status code.
- Passes on headers, content type, and status code from the cache response.
- If the backend returns a 5xx status, just returns the normal app and hopes for the best.
- Adds a SEO_JS_ENABLED setting, so you can disable hooks and middlewares during tests.
- Handles cases where a request didn't come with a User-agent.
- Improvements to unit tests.
- Includes PrerenderIO and PrerenderHosted backends.
- First release - we're using this in production at GreenKahuna ScrapBin.