Allows writing simple scripts to manage locale/translation files.
The core idea behind this project is:
- Provide a powerful yet flexible, extensible tool able to support arbitrary text types/formats 💪
- Allow writing short and easy-to-use scripts 😊
- It's free! 💃
And of course, exploiting Google Docs provides versioning and collaborating for no charge!
Here is the simplest usage example:
from translinguer import Translinguer
document = Translinguer().load_from_gsheets(name="My Translations")
document.save_cfg_by_language_page("__path__")
As a startuper and indi game developer, I created lots of apps with multiple languages support, constantly met a problem of platform/os/framework specific text files formats and a need of manual dealing with then (which is especially painful for cross-platform games support like TacticToy). I found no universal tool that fits all my requirements (excepting paid services), so I wrote separate scripts before I finally got enough (pain and) experience, understanding to create a comprehensive tool like Translinguer.
Originally developed for the Destiny Garden game and widely enhanced while developing Factorio mods, now I hope it can benefit other developers on various platforms!
My typical translating process consists of these steps:
- Optionally: Import existing raw locale files (json, ini, cfg, csv) and upload to Google Sheets
- Write/update texts in a Google Sheet (with myself, friends and volunteers, or Google Translate and ChatGPT)
- Download and parse source texts from the Google Sheet
- Optionally: Save into local cache
- Export texts to result locale files of required format (json, ini, cfg, csv)
But you may have a really different process, and Translinguer will still suit your needs!
I'm a big fan on typing and robust decomposition! To support all the needs I met, I designed 5-layer structure: (1) a document consists of (2) pages that consist of (3) sections which consist of (4) entries that consist of (5) texts by language.
- languages:
LanguageList = List[str]
- texts:
Locales = Dict[PageName: str, Page]
- name:
str
- sections:
Dict[SectionName: str, Section]
- languages:
Optional[LanguageList]
- name:
str
– may be an empty string - entries:
Dict[key: str, Entry]
- key:
str
- by_language:
Dict[Language: str, text: str]
Most of the translation file formats support sections and comments. Translinguer allows this even for tables like CSV, Google Sheets.
Consider the following table:
key | Eng | Rus |
---|---|---|
# This is | a comment | |
greeting | Hello, world! | Привет, мир! |
farewell | Good bye! | Всего хорошего! |
[some-section] | ||
j1.1 | In the beginning was the word | В начале было слово |
Let it be a sheet named "my-texts" in a Google Sheets document. After parsing, it will become a Translinguer document with internal structure looking like this:
{
"languages": ["Eng", "Rus"],
"pages": {
"my-texts": {
"languages": ["Eng", "Rus"],
"sections": {
"": { # Default nameless section
"entries": {
"greeting": {
"Eng": "Hello, world!",
"Rus": "Привет, мир!",
},
"farewell": {
"Eng": "Good bye!",
"Rus": "Всего хорошего!",
},
}
},
"some-section": {
"entries": {
"j1.1": {
"Eng": "In the beginning was the word",
"Rus": "В начале было слово",
},
}
},
}
}
},
}
On converting to CFG files, it will become /en/my-texts.cfg
:
greeting=Hello, world!
farewell=Good bye!
[some-section]
j1.1=In the beginning was the word
And /ru/my-texts.cfg
:
greeting=Привет, мир!
farewell=Всего хорошего!
[some-section]
j1.1=В начале было слово
Comments and sections syntax can be easily change with methods arguments.
Notice that only a whole row may be specified as a comment, you shouldn't use it elsewhere; btw, Google Sheets provide notes and comments which aren't content of the table, I recommend utilise it.
Here is a small script performing all these things:
from translinguer import Translinguer
document = Translinguer(lang_mapper={
'Eng': 'en',
'Rus': 'ru',
})
document.load_from_gsheets(key="__XYZ__")
print(document.stats)
document.validate(raise_error=True)
document.save_cfg_by_language_page("__output_folder__")
The lang_mapper
argument isn't required, but allows having different language names in source tables
and raw/result files, which is more readable and just fancy! ✨
The majority of methods print logs to stdout for a user to know what's going on.
Here is usage for MD2R, one of my open source Factorio mods:
You can have multiple source files for one project uniting them into single Transliguer document. This is useful if you want separate translators access to the files (idk why) or store some pages / sections independently. To achieve it, simply load each file with the same Translinguer instance:
trans = Translinguer()
trans.load_from_gsheets(name="My-Project-Eng")
trans.load_from_gsheets(name="My-Project-Tur")
trans.load_from_gsheets(name="My-Project-Ukr")
You can work with multiple projects in a single document on separate pages with different languages. This may be useful for several small or related sets of texts so that translators can work on them in one place.
To achieve this, you can either download pages of a single project only:
projects = Translinguer()
projects.load_from_gsheets(name="Many-Projects", page_filter={"project_1"})
projects.save_cfg_by_language_page("__output_folder__")
Or download all the pages but export specific pages only:
project1 = Translinguer()
project1.load_from_gsheets(name="Many-Projects")
project1.save_cfg_by_language_page("__output_folder__", page_filter={"project_1"})
Actually, you can combine 3&4, download single page from different source files... More flexibility for the Flexibility God! 😈
You can easily customise or extend Translinguer with:
- New formats for reading/writing – just have a look at the source code.
- Your own validation logic, similarly to embedded
validate
method. Example use cases:- To check that all entries from your source code are present. You can see it in the real example mentioned above.
- To ensure some project-specific consistency
Already wanna try it out? 😁
pip install Translinguer
Additional dependency in case you also would like to use Google Sheets:
pip install gspread
Then you'll need to authenticate into Google Drive & Sheets.
To do this, create a service account, download its credentials
(preferably saving to gsheets-credentials.json
),
assign it access to these services, share access to the files you want.
You can find more details here:
Here is a specification of Translinguer class 🧩
pages: [page_name: str, Page]
– main data storageentries_number: int
– returns total number of entriestexts_number: int
– returns total number of texts in entriesstats: str
– returns a detailed string to print
languages: List[str], optional
– generally there is no need to set this manually.lang_mapper: LangRenamer = Dict[str, str], optional
– allows to have different language names in source tables and raw/result files. Defaults toProxyDict
which stores nothing and simply returns given key as a value.
Checks each entry looking missing language texts. Returns number of errors, optionally raises exception.
raise_error: bool = False
Used to store fully serialized document data.
Filename can be specified with an argument, defaults to DEFAULT_CACHE_FILE = 'texts.json'
Converts whole document in a pure Python object.
Loads document data from a dict.
data: DocumentDict = Dict[str, Dict[...]]
Saves document into self.cache
file.
filename: str, optional = DEFAULT_CACHE_FILE
– cache filename
Loads document from self.cache
file.
filename: str, optional = DEFAULT_CACHE_FILE
– cache filename
Updates current document with texts from specified Google Sheet table.
Client object can be provided manually or will be created using
DEFAULT_GSHEETS_CRED_FILE = 'gsheets-credentials.json'
to authenticate.
Sections and comments are supported, their syntax can be configured with method arguments.
Note that either name
or key
must be provided.
client: gspread.Client, optional
– gspread client objectname: str, optional
– Google sheet filenamekey: str, optional
– Google sheet URL keypage_filter: set[str], optional
– Parses sheets with specified names onlymerge_pages: str, optional
– Merge sheets into one page of given namecomment_prefix: str = '#''
– If the first column starts with this, the line is considered as a commentsection_prefix: str = '['
– If the first column starts with this, it is considered as a section declarationsection_postfix: str = ']'
– Section postfix to clean its name
Exports document (whole or partially) to a CSV string. It is useful for parsing raw text files to upload them into Google Sheet. Sections are supported, their syntax can be configured with method arguments.
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
sections: bool = False
– If to write sectionssection_prefix: str = '['
– If the first column starts with this, it is considered as a section declarationsection_postfix: str = ']'
– Section postfix to clean its namedelimiter: str = '\t'
– csv delimiter
The method saves texts into {output_path}/{language}.{ext}
files. Note that pages get merged.
output_path: str
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
ext: str = 'ini'
The method saves texts into {output_path}/{language}/{page_name}.cfg
files.
output_path: str
lang_mapper: FlexibleRenamer, optional
page_filter: set[str], optional
The method loads texts from {input_path}/{language}/{page_name}.cfg
files.
input_path: str
lang_mapper: FlexibleRenamer, optional
Translinguer lib has some inner types, utility objects and functions.
Classes:
EntryDict
,SectionDict
,PageDict
,DocumentDict
– used for serialization, inherited fromTypedDict
.Entry
,Section
,Page
– actual content ofTranslinguerBase
.ProxyDict
– stores nothing and simply returns given key as a value.
Types:
LangRenamer = Mapping[source: str, raw: str]
– mapping to keep different language names for raw and result files, used as init parameter, may be aProxyDict
.FlexibleRenamer = Union[None, LangRenamer, Callable[[dict], dict]]
– optionally, the same mapping or a function to adjust it, notablydict_reversed
– used to parse raw/result files.PageFilter = Optional[set[str]]
– an alias for a set of page names to pick on texts exporting.LanguageList = List[str]
– an alias for a list of language names as string.Locales = Dict[page_name: str, Page]
– an alias for a dict ofPage
objects.
General:
- Describe data structure
- Describe existing methods and supported formats
- Publish to PyPI
Core:
- Replace dicts in dicts with lists of dedicated classes
- Allow to have separate languages for pages
- Add unit-tests
- Add CI
- Add Google Translate / ChatGPT API usage
- Allow import multiple files with different languages (get rid of
self.pages =
outside of base init)
Formats:
- GSh: add page name mapping
- GSh: add saving
- CSV: add reading
- Export to iOS / Android locale files. With multiple mappings from key to components?
Feel free to make PRs! Just follow the guide below. Also, you can join my Discord to discuss anything.
-
Publish file type if it belongs to a popular platform or framework, no need of project-specific formats.
-
On adding new formats support, follow existing mixin classes type hinting approach and methods naming convention described in the next section.
-
"Private" methods should have file type in their names to avoid collisions.
-
Type-specific settings must be passed as method arguments (see
...
in the naming convention), not into class properties. -
If you add tabular format, make sure to use arguments
comment_prefix, section_prefix, section_postfix
. Use existing GSh and CSV as a reference. -
On document saving/loading (cases 2 & 3 in naming convention) print these events to the console, similarly to other formats methods.
You can use existing add_TYPE.py
files as templates.
This is naming and signature convention of file types public methods.
from_TYPE...(content: str, ...)
to_TYPE...(lang_mapper: FlexibleRenamer, ...) -> str
Typically, these deal with multiple raw/result files at once.
load_TYPE...(input_path: str, ...) -> self
save_TYPE...(output_path: str, lang_mapper: FlexibleRenamer, ...)
load_from_TYPE...(...) -> self
save_to_TYPE...(..., lang_mapper: FlexibleRenamer)