crystal-lang / crystal-website

crystal-lang.org website

Pretty URLs: Problems with `.html` extension in file names

straight-shoota opened this issue · comments

In Jekyll, the output paths for pages have a .html extension by default. For example: /community/governance.html
This is not very pretty, and the extension is irrelevant to readers.
Users typically expect extension-free URLs. One example is the link to https://crystal-lang.org/community/governance from crystal-book. This URL currently returns a 404 because the extension is missing. It would have to be https://crystal-lang.org/community/governance.html

A trivial fix for the correct URL is in crystal-lang/crystal-book#643

But the general problem remains: people write pretty URLs. You shouldn't have to type .html at the end of a URL.

I propose changing the default URL format for pages to permalink: /:path/:basename/. This generates pretty URLs (it basically creates folders containing an index.html, which the web server then serves for the folder path).

A better solution, which keeps both variants working (so existing links don't break), would be to make the .html extension optional directly in the web server. For an nginx config this would be something like try_files $uri $uri.html $uri/ =404;.
I presume there is nothing like this for S3, where we're hosting the website, right? @matiasgarciaisaia any idea?
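
To make the lookup order of that try_files line concrete, here is a rough Python sketch of how the three candidates would be checked. It is a simplification, not nginx itself: the function name `resolve` and the `files` set are made up for illustration, and a directory is treated as present only if it contains an index.html (assuming the server is configured with `index index.html`).

```python
from typing import Optional, Set


def resolve(uri: str, files: Set[str]) -> Optional[str]:
    """Mimic `try_files $uri $uri.html $uri/ =404;` against a set of file paths."""
    # 1. $uri: the request path exists as a file
    if uri in files:
        return uri
    # 2. $uri.html: the same path with the extension appended
    if uri + ".html" in files:
        return uri + ".html"
    # 3. $uri/: the path is a directory, served through its index.html
    #    (assumption: `index index.html` is in effect)
    index = uri.rstrip("/") + "/index.html"
    if index in files:
        return index
    # 4. =404: nothing matched
    return None


# Hypothetical site contents, for illustration only.
site = {"/community/governance.html", "/docs/index.html"}
print(resolve("/community/governance", site))       # extension-free URL
print(resolve("/community/governance.html", site))  # existing link keeps working
print(resolve("/docs", site))                       # folder-style page
print(resolve("/missing", site))                    # falls through to 404
```

With a rule like this, both the old .html links and the extension-free URLs would resolve to the same file.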

We're hosting on AWS S3, which supports object-level redirections.

It'd be awesome if we found a Jekyll plugin that takes care of setting up the redirections, but, if not, we could manually list every URL we currently have (i.e., the current sitemap), then enable pretty URLs, deploy, and set up each redirection we need.

The general redirection rules match by prefix (not suffix), so we won't be able to use a single, general rule.
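
A small sketch of why prefix matching doesn't help here. It models an S3 routing rule with only its `KeyPrefixEquals` condition and `ReplaceKeyPrefixWith` redirect; the function name `apply_prefix_rule` is made up for illustration, and everything else a real rule can carry (host name, protocol, status code) is left out.

```python
from typing import Optional


def apply_prefix_rule(key: str, key_prefix_equals: str,
                      replace_key_prefix_with: str) -> Optional[str]:
    """Return the redirected key, or None if the rule does not match."""
    if not key.startswith(key_prefix_equals):
        return None
    # Only the matched prefix is swapped; the rest of the key is kept verbatim.
    return replace_key_prefix_with + key[len(key_prefix_equals):]


# A rule on a shared prefix rewrites the front of the key, so the
# .html suffix survives untouched.
print(apply_prefix_rule("community/governance.html", "community/", "about/"))

# Dropping the suffix only works when the "prefix" is the whole key,
# which means one rule per page.
print(apply_prefix_rule("community/governance.html",
                        "community/governance.html",
                        "community/governance/"))
```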

Yeah, I don't think custom rules in S3 for every single path are a practical solution.

So a good path forward is probably to set up Jekyll to generate pretty URLs by default. For existing pages with .html paths, we can create custom redirects to the pretty paths inside Jekyll.

I would ignore blog posts for now.

That leaves us with only a few paths:

$ find _site -name '*.html' -not -name index.html -not -path '_site/20*'
_site/community/governance.html
_site/docs.html
_site/sponsors/original-sponsors.html
_site/learning/crystal_programming.html
_site/404.html

404.html is only used internally, so we can ignore that.
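
For reference, the mapping from those paths to their pretty counterparts could be sketched like this in Python. The function name `pretty_redirects` and its parameters are made up for illustration, and the trailing-slash target assumes the /:path/:basename/ permalink format proposed above.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Tuple


def pretty_redirects(site_files: Iterable[str], site_root: str = "_site",
                     skip: Tuple[str, ...] = ("404.html",)) -> Dict[str, str]:
    """Map each old .html URL to its extension-free, folder-style URL."""
    redirects = {}
    for name in site_files:
        rel = PurePosixPath(name).relative_to(site_root)
        # index.html files already serve a folder path; 404.html is internal.
        if rel.suffix != ".html" or rel.name == "index.html" or str(rel) in skip:
            continue
        redirects["/" + str(rel)] = "/" + str(rel.with_suffix("")) + "/"
    return redirects


# The paths from the find output above.
found = [
    "_site/community/governance.html",
    "_site/docs.html",
    "_site/sponsors/original-sponsors.html",
    "_site/learning/crystal_programming.html",
    "_site/404.html",
]
redirects = pretty_redirects(found)
for old, new in redirects.items():
    print(old, "->", new)
```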

Is there any reason to avoid the blogposts? I'd do it for every page, so we forget this existed 😇

Post paths are already clunky, even without the .html extension 🤷 That makes it less of a problem, because you don't just type a post URL somewhere; it's always copy & paste. That's different for pages with simpler paths, which also serve as landing pages.

And it's more complex to implement individual redirects for 100+ post paths.
Maybe a plugin could help with that, but I just think it's probably not worth the effort if we can't have a simple, general redirect rule configuration somewhere.

I think the same script we'll use to create an object-by-object redirect (using the output of your find command up there) would be enough to also redirect blog posts.
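
Roughly, such a script could boil down to something like the sketch below. It only builds the arguments for one zero-byte redirect object per old path and does not talk to AWS. The assumption is that each dict would be passed to boto3's `put_object`, whose `WebsiteRedirectLocation` parameter sets the object-level redirect; the bucket name, the function name `redirect_objects`, and the sample post path are placeholders.

```python
from typing import Dict, List


def redirect_objects(redirects: Dict[str, str], bucket: str) -> List[dict]:
    """Build put_object keyword arguments for one redirect object per old URL."""
    return [
        {
            "Bucket": bucket,
            "Key": old.lstrip("/"),          # S3 keys have no leading slash
            "Body": b"",                     # empty object, it only redirects
            "WebsiteRedirectLocation": new,  # must start with "/" or a URL scheme
        }
        for old, new in sorted(redirects.items())
    ]


# Placeholder mapping: one page and one made-up blog post path.
sample = {
    "/community/governance.html": "/community/governance/",
    "/2020/01/01/example-post.html": "/2020/01/01/example-post/",
}
for kwargs in redirect_objects(sample, bucket="example-bucket"):
    print(kwargs["Key"], "->", kwargs["WebsiteRedirectLocation"])
    # With boto3 installed and credentials configured, this would be:
    # boto3.client("s3").put_object(**kwargs)
```

Pages and posts go through the same loop, so covering blog posts is only a matter of feeding in more paths.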

I mean - it's probably just you and me manually typing URLs. People don't do that :P Let's aim for consistency, at least.

Not sure how exactly you want to do that. But I've created #352 which moves the pages to pretty URLs.

For blog posts, the change is trivial:

--- i/_config.yml
+++ w/_config.yml
@@ -53,7 +53,7 @@ defaults:
       type: releases
     values:
       layout: post
-      permalink: /:year/:month/:day/:title.html
+      permalink: /:year/:month/:day/:title
       image: /assets/icon.png

 twitter:

Then we just need redirects.

Resolved by #352