adamduncan / eleventy-plugin-i18n

Eleventy plugin to assist with internationalization and dictionary translations

Feature request: Automatically create output files for each locale

icaliman opened this issue · comments

Suppose we have these source files:

├─ src
   ├─ about.njk
   └─ index.njk

In these files we use the i18n universal filter for all texts like: {{ 'hello' | i18n }}

Is it possible to generate output files for each locale like below?

├─ output
   ├─ en
   │   ├─ about.html
   │   └─ index.html
   ├─ es
   │   ├─ about.html
   │   └─ index.html
   ├─ about.html      <--- default locale
   └─ index.html      <--- default locale

I like this idea because I wouldn't have to create multiple files for each page on our site, at least not the main pages.

On the site we have to have all pages available in French and English. I would rather have a way to create one file and hold all the content for both languages. At least for our main landing pages that will be collections of tags that we add to our documentation pages.

What I am trying to do is put the information for both languages in the front matter and pull it into the pages, using slugify to build the URL with the language code.

Not sure if this would work but I think it would be great. Eventually using Netlify / Decap CMS to allow our content owners to maintain the pages themselves.

---
title:
  - en: Accessibility fundamentals
  - fr: Principes fondamentaux de l'accessibilité
layout: layouts/landing.njk
permalink: "/{{ lang }}/{{ title | slugify }}/"
description:
  - en: Discover the accessibility principles behind creating digital products and services that remove barriers for people living with disabilities while ensuring ease of use for all.
  - fr: Découvrez les principes d'accessibilité qui guident la création de produits et de services numériques qui éliminent les obstacles pour les personnes handicapées tout en assurant une facilité d'utilisation pour tous. 
keywords:
  - en:
    - Web Accessibility
    - Assistive Technology
  - fr:
    - Accessibilité du Web
    - Technologie d'assistance
---

Thoughts?
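One detail worth noting about the front matter above: each `- en: ...` entry parses as a single-key mapping inside a YAML list, so `title` would arrive as an array of objects rather than a string, and `{{ title | slugify }}` would not receive plain text. A minimal sketch of a helper that resolves such a field for one language follows. The name `pickLocale` and its fallback behaviour are hypothetical and are not part of the plugin or of Eleventy.

```javascript
// Hypothetical helper: resolve a multilingual front matter field for one language.
// Accepts either a list of single-key maps (the YAML shape above)
// or a plain { en: ..., fr: ... } object.
function pickLocale(field, lang, fallbackLang = "en") {
  const byLang = Array.isArray(field)
    ? Object.assign({}, ...field) // [{ en: "A" }, { fr: "B" }] -> { en: "A", fr: "B" }
    : field;
  if (byLang === null || typeof byLang !== "object") {
    return byLang; // a plain string (or missing value) is returned unchanged
  }
  return lang in byLang ? byLang[lang] : byLang[fallbackLang];
}

const title = [
  { en: "Accessibility fundamentals" },
  { fr: "Principes fondamentaux de l'accessibilité" },
];

console.log(pickLocale(title, "fr"));
console.log(pickLocale(title, "de")); // no German entry, falls back to English
```

If registered as a filter (for example with `eleventyConfig.addFilter("pickLocale", pickLocale)`), the permalink could then be written along the lines of `/{{ lang }}/{{ title | pickLocale(lang) | slugify }}/`, assuming `lang` is supplied by something else in the data cascade.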

I think that's a great idea. I have tried using the plugin myself and was a little disappointed that I'd have to maintain two identical files, one per locale, instead of having one document as a single source of truth like in other i18n libraries. I tried to patch a solution together myself by copying the files in the src directory into their respective locale directories and then letting 11ty build the project, but when these "duplicated" files are also listed in your .gitignore, this approach stops working (11ty skips files matched by .gitignore by default). Seeing that the whole point of 11ty is to reduce redundancy while writing what ends up being static files, this design decision seems rather odd to me.

@adamduncan, your work here is very much appreciated, but could you say something about the state of the project? I see some PRs that have been open for quite some time now, so I am not sure if this project is still maintained or not. Would be great if you could shed some light on the future of this project.

Hey folks, apologies for the delay in responding.

I can understand where you're coming from in not wanting to have to maintain many discrete sets of pages for each respective locale site. Fundamentally though, this plugin aligns with the prevailing approach at the time of its development (and subsequently documented in 11ty's Internationalization docs). The core idea here being that while the structure of pages could be the same/similar in each respective language site, each of the page's frontmatter and contents would differ, and in fact there mightn't be parity in the pages available, nor the translated page paths. (To choose a random but common example, the French Ministry site demonstrates these characteristics.)

I'm trying to better understand the goals of a different design for a single set of pages, and whether everyone here is talking about the same solution or variations on a common theme. I'm not sure I understand how the current characteristics would be achieved with the proposed approach.

@icaliman, to use your example, can you elaborate on how you envisage authoring the content for the different en/es translations? Where are the common translation terms for i18n managed? How would one manage the different language content for the About page in English and Spanish? (Likewise @shawnthompson, for your en/fr frontmatter example?)

@StefanGreve For the needs of the plugin's initial design, I feel it's something that still serves its modest purpose. I have been intending to revisit the open PRs and their API change suggestions, which I should've done long ago. I'd say for the most part, they'd been looking to introduce API changes that don't seem to serve a critically broad need, and so I'd question the value in altering/extending the API surface. Will comment/close each accordingly.

Re: the future of the project: It's not something I've got much time to dedicate to, but I had wanted to bring up to date with the latest 11ty Plugin conventions (this plugin was created in a pre-v1 11ty world, where plugins were fewer and documentation on creating them fairly sparse), as well as addressing any design suggestions to cover open issues/PRs as best possible. At its core, the plugin does make some fairly brittle assumptions about file path structure to deduce locale, which I'd also like to revisit in a major version update when I have the bandwidth. Realistically, I couldn't put a date on when that would be, but hoping to carve out some time when I'm next actively developing an 11ty project.

@adamduncan Reading your code I can understand why you went for that approach, it's a very straightforward solution that requires admirably few lines of code. The main issue I take with this plugin is that it requires me to keep identical copies of each template file. Taking @icaliman folder structure for example, I'd write something like

# en/index.njk
---
title: Hello
layout: layouts/base.njk
---

<p>{{ 'hello' | i18n }}</p>

and

# es/index.njk
---
title: Hello
layout: layouts/base.njk
---

<p>{{ 'hello' | i18n }}</p>

which seems okay with these minimal examples, but if you try to apply that to a real-world project you end up maintaining at least one copy per source file per locale. One could probably find a way around this by using layout files for the home page, about page, etc., but that's just working around the limitations of a plugin; it also makes for a different project structure than you would typically expect to see in an 11ty project. If you think this feature request is out of scope for this project then that's totally fine; thanks so much for your fast response! :)

@StefanGreve No problem, appreciate the elaboration.

Just making sure I'm clear on terms: So I'd expect a multi-lingual project using this plugin to maintain individual content pages for each respective language. Whereas the layouts (templates you're referring to?) that hold content would be generic and language-agnostic (perhaps leveraging the i18n helper throughout for pieces of microcopy if need be).

This is perhaps where I'm not following the outcome you're referring to where we have identical duplicate pages across different language sites. Maybe you could help to clarify?

E.g. In my mental model, it'd be more like:

# en/index.md
---
title: Hello
layout: layouts/base.njk
---

# Hello

[A bunch of content in English...]

and

# es/index.md
---
title: Hola
layout: layouts/base.njk
---

# Hola

[A bunch of content in Spanish...]

...and so the translated content necessitates there being per-language content pages.

Then in the shared layouts, should there be the need for translations, we might find ourselves leveraging the i18n filter, like:

{# layouts/base.njk #}

<header>
  <button>{{ 'menu' | i18n }}</button>
</header>

<main>
  {{ content }}
</main>

I can certainly see @shawnthompson's aim in trying to consolidate the title/description etc. frontmatter into a single site tree with multiple languages declared in frontmatter, but I think it's fundamentally the inverse approach to the problem space. Given the different language sites' content needs, I'd use that to drive the solution, hence arriving at the multiple site trees approach suggested in the 11ty docs.

Hope that helps

Yes thanks @adamduncan for the reply and the work you've already put into this.

I think what we (at least we) are trying to do is avoid having to update multiple files throughout the source tree if the same update is found in all variants of the pages.

├─ src
   ├─ en
   │   ├─ about.html
   │   └─ index.html
   └─ es
       ├─ about.html
       └─ index.html

Let's say you need to change the first paragraph on both the English and Spanish about.html page, you would need to update and commit both files.

That might be OK if you're only using two languages, but what if you're using many different languages? (I'm happy to only have to support two.)

Or let's say we organize our pages into subfolders: it would be nice to only have to create one file and folder and use the frontmatter (or a .json in the folder) to define the values needed for the layout.

Only a part of our current source tree:

[root]
└─ src
   ├─ en
   │  ├─ guides
   │  │  ├─ design-accessible-services.html
   │  │  ├─ ict-requirements.html
   │  │  ├─ improving-form-accessibility.html
   │  │  ├─ index.html
   │  │  ├─ office2016
   │  │  │  ├─ accessible-visio-diagrams.html
   │  │  │  ├─ accessible-word-documents.html
   │  │  │  └─ index.html
   │  │  ├─ office365
   │  │  │  ├─ accessible-visio-diagrams-365.html
   │  │  │  ├─ accessible-word-documents-365.html
   │  │  │  └─ index.html
   │  │  ├─ personas
   │  │  │  ├─ index.html
   │  │  │  ├─ jim.html
   │  │  │  ├─ julie.html
   │  │  │  ├─ steve.html
   │  │  │  └─ juan.html
   │  │  └─ virtual-meetings
   │  │     ├─ index.html
   │  │     └─ hybrid.html
   │  └─ index.html
   ├─ fr
   │  ├─ guides
   │  │  ├─ design-accessible-services.html
   │  │  ├─ ict-requirements.html
   │  │  ├─ improving-form-accessibility.html
   │  │  ├─ index.html
   │  │  ├─ office2016
   │  │  │  ├─ accessible-visio-diagrams.html
   │  │  │  ├─ accessible-word-documents.html
   │  │  │  └─ index.html
   │  │  ├─ office365
   │  │  │  ├─ accessible-visio-diagrams-365.html
   │  │  │  ├─ accessible-word-documents-365.html
   │  │  │  └─ index.html
   │  │  ├─ personas
   │  │  │  ├─ index.html
   │  │  │  ├─ jim.html
   │  │  │  ├─ julie.html
   │  │  │  ├─ steve.html
   │  │  │  └─ juan.html
   │  │  └─ virtual-meetings
   │  │     ├─ index.html
   │  │     └─ hybrid.html
   │  └─ index.html
   └─ index.html

What we did was write a script that creates each new page in both language folders, so that it always ends up in the right spots.

gc-da11yn.github.io/scripts/create-new-folder.js at main · Work in progress
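For readers who have not opened the linked file, here is a minimal sketch of what such a script might do. It is not the linked script: the function name `createPageForLocales`, its arguments, and the `lang` front matter stub are all assumptions made for illustration.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical sketch, not the linked script: create the same page stub
// under every locale folder so the language trees stay in sync.
function createPageForLocales(srcDir, locales, relativePath) {
  const created = [];
  for (const locale of locales) {
    const filePath = path.join(srcDir, locale, relativePath);
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    try {
      // The "wx" flag makes the write fail if the file already exists,
      // so existing pages are never overwritten.
      fs.writeFileSync(filePath, `---\nlang: ${locale}\n---\n`, { flag: "wx" });
      created.push(filePath);
    } catch (error) {
      if (error.code !== "EEXIST") throw error;
    }
  }
  return created; // paths of the files that were actually created
}
```

A call such as `createPageForLocales("src", ["en", "fr"], "guides/new-page.html")` would create `src/en/guides/new-page.html` and `src/fr/guides/new-page.html` if they do not exist yet.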

Thanks @shawnthompson.

"...is avoid having to update multiple files throughout the source tree if the same update is found in all variants of the pages"

"Let's say you need to change the first paragraph on both the English and Spanish about.html page, you would need to update and commit both files."

I think these two points are where our understanding is diverging. I'd have thought that these statements would be necessary/desirable steps should the content need to change. I.e. It makes sense that if the intro paragraph of the About page needed to change, it would be necessary to change it both in English and in French. The contents of them wouldn't be the "same" per se (different languages) and so would both need updating independently.

For example, the Curriculum > Web Application Developer content all needs translating, both in English and in French. I'm not sure I see how all of that content for both languages (or more) could be co-located in a single file in a manageable way (i.e. within frontmatter/data as is being proposed), even if it was abstracted to some replicable structured format. Can you demonstrate what you mean with regards to the content itself?

(I can totally relate to wanting to not have to manage the fairly "templated" HTML itself in multiple files—multi-lingual in SSG can be a laborious process—just not sure I can envisage a way to consolidate so much content into a single file in a way that's intuitive to manage.)

What if we used .njk files as the main pages instead of .html, and set a variable in the content area (or something similar) that would allow the plugin to tell the difference between the language parts?

{# English content starts here #}
{% set lang = "en" %}
## This is an English heading level 2

Content goes here: Lorem ipsum dolor sit amet consectetur adipisicing elit. Sunt, asperiores! Quaerat quo ut esse, commodi inventore aliquam excepturi, facere in ea rerum debitis neque voluptatem?

{# French content starts here #}
{% set lang = "fr" %}
## Il s'agit d'une rubrique français de niveau 2

Le contenu va ici : Lorem ipsum dolor sit amet, consectetur adipisicing elit. Ut dolores animi dolorum aut nulla nemo exercitationem nisi aspernatur, accusamus sit quaerat harum sed reiciendis laborum minus iusto atque quae est.

This thread is an interesting one, and it reminded me of a similar problem. I wanted to offer an idea for either a custom approach or a potentially plugin-able one: you could use Eleventy's pagination feature to render one output file per locale without having one input file per locale (as long as you don't also need pagination within each locale, since nested pagination is currently hacky at best).

I made a proof of concept here after somebody asked for a similar thing on the Eleventy Discord: https://github.com/chriskirknielsen/test-eleventy-i18n-as-pagination.

However it has all the content in a data file instead of in the input file itself, so it may be a little less practical for long form content. That could be solved using a modified version of Shawn's idea above, moved into the input file instead of the data file:

{# English content starts here #}
{% if lang == "en" %}
## This is an English heading level 2

Content goes here: Lorem ipsum dolor sit amet consectetur adipisicing elit. Sunt, asperiores! Quaerat quo ut esse, commodi inventore aliquam excepturi, facere in ea rerum debitis neque voluptatem?
{% endif %}

{# French content starts here #}
{% if lang == "fr" %}
## Il s'agit d'une rubrique français de niveau 2

Le contenu va ici : Lorem ipsum dolor sit amet, consectetur adipisicing elit. Ut dolores animi dolorum aut nulla nemo exercitationem nisi aspernatur, accusamus sit quaerat harum sed reiciendis laborum minus iusto atque quae est.
{% endif %}

Just throwing the idea out there in case it helps anyone. Happy to assist in making this into a concrete solution if anyone's interested.
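To make the pagination idea above concrete, here is a minimal sketch written as an Eleventy JavaScript template. It is not taken from the linked proof of concept; the file name, the `content` object, and the `locales` key are assumptions. It relies on Eleventy's documented ability to paginate over an array defined in the template's own data, with `size: 1` and an `alias`, and on JavaScript templates accepting a function for `permalink`.

```javascript
// Hypothetical file: src/about.11ty.js (one input file, one output per locale).
const content = {
  en: { title: "About", body: "Content in English." },
  fr: { title: "À propos", body: "Contenu en français." },
};

class AboutPage {
  data() {
    return {
      locales: Object.keys(content),
      // Paginate over the locales with a page size of 1 and expose
      // the current locale to the template as `lang`.
      pagination: { data: "locales", size: 1, alias: "lang" },
      // One output file per locale, e.g. /en/about/index.html
      permalink: (data) => `/${data.lang}/about/index.html`,
    };
  }

  render({ lang }) {
    const { title, body } = content[lang];
    return `<h1>${title}</h1>\n<p>${body}</p>`;
  }
}

module.exports = AboutPage;
```

Because the pagination slot is used up by the locales, a page built this way cannot also paginate its own content without the nested-pagination workarounds mentioned above.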

@adamduncan

"I think these two points are where our understanding is diverging. I'd have thought that these statements would be necessary/desirable steps should the content need to change."

I can only speak for myself but that's not desirable for me. Coming from another i18n framework, it's very rare in my experience that you would want to have a different HTML layout for different locales. As far as the file structure is concerned, changes to the code here are usually locale independent.

For example, if I want to change an image path in an index.njk file I need to make this change n times (once for each locale), although this has nothing to do with localization. With more than 2 locales this becomes increasingly prone to human error (forgetting to update a file, not making the exact same changes to each locale where they should be the same), which can actually cause these files to diverge in layout over time. This adds chore work, which might discourage some projects from adding more locales to their website.

I think that's why many i18n frameworks opt into an approach where you have one source file that reads the translation string from a JSON file and takes things from there, e.g.

<!-- index.html >> en, es, ... -->
<p> { t('hello') } </p>

That's probably the content negotiation 11ty is talking about in the documentation. While there is also lots of merit in your design, I think ultimately some of us are looking for a different solution.
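The single-source pattern described above, where one template reads its strings from a dictionary, could look roughly like this. The sketch is generic and does not reproduce any particular framework or this plugin's implementation; the `dictionary` shape and the `createTranslator` name are assumptions.

```javascript
// Hypothetical dictionary; in practice this could be loaded from a JSON file.
const dictionary = {
  hello: { en: "Hello", es: "Hola" },
  menu: { en: "Menu" },
};

// Returns a t(key) function bound to one locale. It tries the requested
// locale, then the fallback locale, and finally returns the key itself
// so that missing strings stay visible in the output.
function createTranslator(dict, locale, fallbackLocale = "en") {
  return function t(key) {
    const entry = dict[key] || {};
    return entry[locale] ?? entry[fallbackLocale] ?? key;
  };
}

const t = createTranslator(dictionary, "es");
console.log(t("hello"));   // Hola
console.log(t("menu"));    // Menu (no Spanish entry, falls back to English)
console.log(t("missing")); // missing
```

The template itself then stays identical for every locale; only the translator bound to it changes.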

Interesting approach, @shawnthompson @chriskirknielsen. It's a tricky one, as it's a fundamentally different approach to the route taken by this plugin and 11ty docs. Agree it starts to become less practical for long-form content, which this plugin was really designed around.

I don't think I'd see the current and the "consolidated" single-file approach co-existing in a single plugin without introducing a bit of complexity in the library and for consumers 🤔 It feels like shifting that from one place (managing multiple site trees) to another (managing potentially large amounts of content for potentially many languages in one file). The former feels more aligned with the 11ty approach to generating pages, and I'd still lean to it being the right approach when we're talking about unstructured Markdown content.

@StefanGreve Yep, this is what I was trying to deduce with my original comment. Agree it's not an ideal solution for layouts (i.e. structured HTML that should be content-agnostic — e.g. 11ty docs layouts example), but think we're conflating that with content (i.e. longer form unstructured Markdown content — e.g. 11ty's docs content example). I'd not usually expect folks to take the approach of multiple language-specific versions of layout files (though they might have some pieces of microcopy dotted throughout them — where our i18n filter is doing the job you're referring to).

It sounds like there is a tricky middle-ground some are finding themselves in where the content pages leverage a lot of structural HTML (the example @shawnthompson shared). There we have quite HTML-rich content, and fair to say making the same markup amend across many language pages is unavoidably laborious.

Though I'm not sure consolidating multiple language versions of content into the same file and same file-tree solves that problem? I.e. The change would still need to be made in multiple places, albeit in a single, very large file? In that scenario one might look to abstract the content itself into a structured data format, and use a layout to extract the wrapping markup to a single language-agnostic source of truth.
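As a rough illustration of that last suggestion, the sketch below keeps only the text per language and puts the wrapping markup in one function. The data shape and the `renderSection` name are hypothetical, and the values are not HTML-escaped, which a real layout would need to handle.

```javascript
// Hypothetical structured content: only the text differs per language.
const aboutContent = {
  en: { heading: "About us", paragraphs: ["First paragraph.", "Second paragraph."] },
  fr: { heading: "À propos de nous", paragraphs: ["Premier paragraphe.", "Deuxième paragraphe."] },
};

// Language-agnostic markup: a structural change is made once, here.
function renderSection({ heading, paragraphs }) {
  const body = paragraphs.map((text) => `  <p>${text}</p>`).join("\n");
  return `<section>\n  <h2>${heading}</h2>\n${body}\n</section>`;
}

for (const [lang, content] of Object.entries(aboutContent)) {
  console.log(`<!-- ${lang} -->\n${renderSection(content)}`);
}
```

A markup change (say, swapping `<section>` for `<article>`) then touches one place, while a wording change still has to be made once per language.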

@adamduncan, we are converting all our HTML-rich content to structured Markdown as much as possible and looking at ways to make it easier for others to contribute to our project, using Markdown and the Github web interface to update pages. This is why I was looking at ways to put the language versions in one file. I'm doing an overhaul of the full structure and site architecture.

Now that I think of it, these single files would be quite large. I'm wondering if they would be unmanageable.

I wonder if a folder for every page with language variants would be better in my case.

[root]
└─ src
   ├─ guides
   │  ├─ design-accessible-services
   │  │  ├─ en.md
   │  │  └─ fr.md
   │  ├─ ict-requirements
   │  │  ├─ en.md
   │  │  └─ fr.md
   │  └─ improving-form-accessibility
   │     ├─ en.md
   │     └─ fr.md
   ├─ en.md
   └─ fr.md

I'm also looking to use tags to build all the main pages of the site, while also having collections of topics that give users multiple ways to navigate our site depending on what they are looking for.