The Unicode Consortium's repositories
lstm_word_segmentation
Python code for training an LSTM model for word segmentation in Thai, Burmese, and similar languages.
jira-github-pr-check
Checks GitHub pull requests for valid and accepted Jira tickets. Used by ICU and CLDR.
cldr-implementers-guide
Implementer's Guide for CLDR
ml-confusables-generator
Generates confusables for the Han script using machine learning techniques
icu4x-docs
ICU4X Docs
test-corpora
Corpora in many languages for testing, evaluating, benchmarking, and training Unicode algorithms