mihxil / i18n-iso-639

Provides a java class 'LanguageCode' with instances for all available codes in ISO-639-3

JAVA ISO-639 support

Codes for the languages (and language groups) of the world are covered by the ISO-639 family of standards.

These standards provide letter codes for each language. E.g. ISO-639-3 assigns a three-letter code to each individual language, living or extinct.

There are too many such codes to fit in a Java enum (e.g. https://github.com/TakahikoKawasaki/nv-i18n/blob/master/src/main/java/com/neovisionaries/i18n/LanguageAlpha3Code.java is not complete).

This package bundles the tab-separated files provided by https://iso639-3.sil.org/, together with Java classes that read them and expose all language codes as Java objects with getters.
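
To illustrate the idea of turning a tab-separated code table into objects, here is a minimal sketch. It is not the library's own reader: the class `Iso639TableSketch`, its nested `Entry` class, and the `parse` method are hypothetical names, and the column layout (`Id`, `Part2b`, `Part2t`, `Part1`, `Scope`, `Language_Type`, `Ref_Name`, `Comment`) is assumed from the published ISO 639-3 code table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the library's real classes are LanguageCode and ISO_639.
final class Iso639TableSketch {

    /** Minimal value object for one row of the table. */
    static final class Entry {
        final String part3;   // three-letter ISO 639-3 code
        final String part1;   // two-letter ISO 639-1 code, or null if none exists
        final String refName; // reference name

        Entry(String part3, String part1, String refName) {
            this.part3 = part3;
            this.part1 = part1;
            this.refName = refName;
        }
    }

    /** Parses tab-separated text (header line first) into a map keyed by Part 3 code. */
    static Map<String, Entry> parse(String table) {
        Map<String, Entry> byPart3 = new LinkedHashMap<>();
        String[] lines = table.split("\n");
        for (int i = 1; i < lines.length; i++) {      // index 0 is the header
            String[] cols = lines[i].split("\t", -1);  // -1 keeps trailing empty columns
            if (cols.length < 7) {
                continue;                              // skip malformed rows
            }
            String part1 = cols[3].isEmpty() ? null : cols[3];
            byPart3.put(cols[0], new Entry(cols[0], part1, cols[6]));
        }
        return byPart3;
    }

    public static void main(String[] args) {
        String sample =
            "Id\tPart2b\tPart2t\tPart1\tScope\tLanguage_Type\tRef_Name\tComment\n"
            + "nld\tdut\tnld\tnl\tI\tL\tDutch\t\n"
            + "zxx\tzxx\tzxx\t\tS\tS\tNo linguistic content\t\n";
        Map<String, Entry> codes = parse(sample);
        System.out.println(codes.get("nld").refName);
        System.out.println(codes.get("zxx").part1);
    }
}
```

Keying the map by the Part 3 code works here because every row has one; a lookup by Part 1 code would need a second index, since most rows leave that column empty.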

Usage

    import java.util.Locale;
    import java.util.Optional;
    import org.meeuw.i18n.languages.*;

    // get a language by its code
    Optional<LanguageCode> optional = ISO_639.getByPart3("nld");
    LanguageCode languageCode = LanguageCode.languageCode("nl");

    // show its 'inverted' name
    System.out.println(languageCode.nameRecord(Locale.US).inverted());

    // get a language family
    Optional<LanguageFamilyCode> family = ISO_639.getByPart5("ger");

    // get by any code
    Optional<ISO_639_Code> byCode = ISO_639.get("nl");

    // stream by names; a language may have several names (Dutch, Flemish) and so appear multiple times
    ISO_639_Code.streamByNames().forEach(e -> {
        System.out.println(e.getKey() + " " + e.getValue());
    });

See also the test cases in src/test/java/org/meeuw/i18n/languages/test/LanguageCodeTest.java.

Retired codes

LanguageCode#getByCode also resolves retired codes where possible. The code of the returned object may therefore differ from the code that was requested:

    // the 'Krim' dialect (Sierra Leone) officially merged into 'bmf' (Bom-Kim) in 2017
    assertThat(LanguageCode.getByCode("krm").get().getCode()).isEqualTo("bmf");
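
One way such a lookup could work is to keep a table of retired codes and their successors and to follow it until an active code is reached. The sketch below only illustrates that idea and is not taken from the library; `RetiredCodeSketch`, `addActive`, `addRetirement` and `resolve` are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of resolving retired codes; not the library's implementation.
final class RetiredCodeSketch {

    private final Set<String> active = new HashSet<>();
    private final Map<String, String> changeTo = new HashMap<>();

    void addActive(String code) {
        active.add(code);
    }

    void addRetirement(String retired, String successor) {
        changeTo.put(retired, successor);
    }

    /** Returns the active code for the input, following retirements where a successor is known. */
    Optional<String> resolve(String code) {
        String current = code;
        // A chain can be at most changeTo.size() hops long; the bound also guards against cycles.
        for (int hops = 0; hops <= changeTo.size(); hops++) {
            if (active.contains(current)) {
                return Optional.of(current);
            }
            current = changeTo.get(current);
            if (current == null) {
                return Optional.empty(); // unknown, or retired without a single successor
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RetiredCodeSketch codes = new RetiredCodeSketch();
        codes.addActive("bmf");
        codes.addRetirement("krm", "bmf");
        System.out.println(codes.resolve("krm"));
        System.out.println(codes.resolve("qqq"));
    }
}
```

A retired code with no single successor resolves to an empty result in this sketch, which is one reading of the "where possible" caveat above.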

Fallbacks

Sometimes we have to deal with systems that use their own variants of the standards. In these cases it is possible to register 'fallbacks'.

E.g.

    // Our partner uses the pseudo ISO-639-1 code 'XX' for 'no language';
    // fall back to a proper Part 3 code.
    try {
        LanguageCode.registerFallback("XX", LanguageCode.languageCode("zxx"));
        assertThat(ISO_639.iso639("XX").code()).isEqualTo("zxx");
    } finally {
        LanguageCode.resetFallBacks();
    }
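
The mechanism can be pictured as a second lookup table that is consulted only when a code is not an official one. The following is a rough sketch under that assumption, not the library's code; `FallbackRegistrySketch` and its methods are hypothetical, and the two-entry `OFFICIAL` set stands in for the real code table.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a fallback registry; not the library's implementation.
final class FallbackRegistrySketch {

    // Stand-in for the official code table, which is far larger in reality.
    private static final Set<String> OFFICIAL = new HashSet<>(Arrays.asList("nld", "zxx"));

    // Non-standard code -> official code it should resolve to.
    private static final Map<String, String> FALLBACKS = new ConcurrentHashMap<>();

    static void registerFallback(String code, String officialCode) {
        if (!OFFICIAL.contains(officialCode)) {
            throw new IllegalArgumentException("Not an official code: " + officialCode);
        }
        FALLBACKS.put(code, officialCode);
    }

    static void resetFallbacks() {
        FALLBACKS.clear();
    }

    /** Official codes win; fallbacks are consulted only for unknown input. */
    static Optional<String> resolve(String code) {
        if (OFFICIAL.contains(code)) {
            return Optional.of(code);
        }
        return Optional.ofNullable(FALLBACKS.get(code));
    }

    public static void main(String[] args) {
        try {
            registerFallback("XX", "zxx");
            System.out.println(resolve("XX"));
        } finally {
            resetFallbacks(); // global state: always clean up, as in the example above
        }
    }
}
```

Because the registry is global state, resetting it in a `finally` block keeps one test or caller from leaking fallbacks into another.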

Support

JAXB

The language code is annotated with a JAXB annotation. It will serialize and deserialize to and from the code. The dependency on the annotation is optional.

JSON

The relevant classes are also annotated with Jackson annotations, so they can be serialized to and deserialized from JSON.

Serializable

LanguageCode is serializable and ensures that deserialization returns the same object for every language. Only the code is non-transient.
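
A common way to get this behaviour in plain Java is to write only the code and to swap the deserialized copy for the canonical instance in `readResolve`. The sketch below shows that general technique; whether the library does it exactly this way is an assumption, and `CanonicalCodeSketch` with its two-entry registry is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of canonical instances surviving serialization.
final class CanonicalCodeSketch implements Serializable {

    private static final long serialVersionUID = 1L;

    private static final Map<String, CanonicalCodeSketch> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("nld", new CanonicalCodeSketch("nld", "Dutch"));
        REGISTRY.put("zxx", new CanonicalCodeSketch("zxx", "No linguistic content"));
    }

    private final String code;            // the only field written to the stream
    private final transient String name;  // restored via the canonical instance

    private CanonicalCodeSketch(String code, String name) {
        this.code = code;
        this.name = name;
    }

    static CanonicalCodeSketch of(String code) {
        CanonicalCodeSketch found = REGISTRY.get(code);
        if (found == null) {
            throw new IllegalArgumentException("Unknown code: " + code);
        }
        return found;
    }

    String code() {
        return code;
    }

    String name() {
        return name;
    }

    // Called by Java serialization after reading; replaces the copy with the registry instance.
    private Object readResolve() throws ObjectStreamException {
        CanonicalCodeSketch canonical = REGISTRY.get(code);
        if (canonical == null) {
            throw new InvalidObjectException("Unknown code: " + code);
        }
        return canonical;
    }

    /** Serializes and deserializes in memory, to demonstrate identity preservation. */
    static CanonicalCodeSketch roundTrip(CanonicalCodeSketch value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CanonicalCodeSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CanonicalCodeSketch dutch = of("nld");
        System.out.println(roundTrip(dutch) == dutch);
    }
}
```

Keeping the other fields transient keeps the serialized form small and avoids stale names if the code tables are updated between writing and reading.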

Sortable

The default sort order of a LanguageCode used to be by 'Inverted Name'. There may be more than one (inverted) name though (e.g. Dutch and Flemish). Since 3.0 LanguageCode is no longer sortable. LanguageCode#stream() is sorted by ISO-639-3 code.
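
Without a natural order, callers choose the ordering explicitly. The sketch below contrasts the two orderings discussed here on a stand-in type: sorting by code gives one entry per language, while sorting by name lists a language once per name. `NameOrderSketch` and `Lang` are hypothetical and do not use the library's API.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch contrasting sort-by-code with sort-by-name.
final class NameOrderSketch {

    /** Stand-in for a language with one code and possibly several inverted names. */
    static final class Lang {
        final String code;
        final List<String> invertedNames;

        Lang(String code, String... invertedNames) {
            this.code = code;
            this.invertedNames = Arrays.asList(invertedNames);
        }
    }

    /** One entry per language, ordered by code. */
    static List<String> codesByCode(List<Lang> langs) {
        List<String> codes = new ArrayList<>();
        for (Lang lang : langs) {
            codes.add(lang.code);
        }
        codes.sort(Comparator.naturalOrder());
        return codes;
    }

    /** One "name code" entry per name, ordered by name and then by code. */
    static List<String> entriesByName(List<Lang> langs) {
        List<Map.Entry<String, String>> entries = new ArrayList<>();
        for (Lang lang : langs) {
            for (String name : lang.invertedNames) {
                entries.add(new SimpleImmutableEntry<>(name, lang.code));
            }
        }
        entries.sort(Map.Entry.<String, String>comparingByKey()
            .thenComparing(Map.Entry.<String, String>comparingByValue()));
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : entries) {
            result.add(e.getKey() + " " + e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Lang> langs = Arrays.asList(
            new Lang("nld", "Dutch", "Flemish"),
            new Lang("deu", "German"));
        System.out.println(codesByCode(langs));
        System.out.println(entriesByName(langs));
    }
}
```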

Versions

<1 (2023): developing/testing

1.x (1.0: 2023-11-30): compatible with Java 8, javax.xml, module-info Java 11

2.x (2024-01-28): Java 11, jakarta.xml. Jakarta mostly applies to the optional JAXB support (and to some, also optional, validation annotations).

2.1 (2024-02-11): support for retired codes

2.2 (2024-?): migrated support for language code validation from i18n-regions

3.0 (2024-3): Refactoring. Added enum for ISO-639-1 codes. Made syntax forward compatible with records, so getters like getPart1() are dropped in favor of part1(). LanguageCode itself is now an interface. This may be backported to 1.2 for javax compatibility.

3.1 (2024-3): Refactoring. Support for ISO-639-5. Dropped the -3 from the artifact id.
