Rairye's repositories

zh-sentence

Light-weight sentence tokenizer for Chinese languages.

Language: Python | License: NOASSERTION | Stargazers: 2 | Issues: 1 | Issues: 0

ht-getter

Searches a document for hashtags. Supports multiple natural languages. Works in various contexts.

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 1 | Issues: 1

ja-sentence

Light-weight sentence tokenizer for Japanese.

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 1 | Issues: 0

js-sentence-tokenizers

JavaScript sentence tokenizers for multiple natural languages.

Language: JavaScript | License: NOASSERTION | Stargazers: 1 | Issues: 1 | Issues: 0

kr-sentence

Light-weight sentence tokenizer for Korean. Supports full-width and half-width punctuation marks.

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 1 | Issues: 0

sentence-tokenizers

Sentence tokenizers for several languages.

Language: Java | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

thelangbot

Twitter bot to help you learn foreign languages. Building a community through tweets. Retweets #100DaysOfLanguage and #langtwt.

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

back-cleaner

Server-side Python tool for escaping script tags and converting characters into HTML entities (no regex).

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

content_moderation_ideas

A collection of proof-of-concept approaches that apply ideas from NLP and text processing to content moderation (light-weight approaches, no ML).

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

convert-with-ents

Light-weight tool for converting characters in a string into common HTML entities (without regex).

Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

freefields-from-string

Code for extracting field-like text from unformatted strings

Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

gs-scripts

Samples of .gs scripts

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

rr-search-tries

Trie-based search classes for JavaScript

Stargazers: 0 | Issues: 1 | Issues: 0

CPP-samples

C++ samples

Language: C++ | Stargazers: 0 | Issues: 2 | Issues: 0

js-mnl-punct-norm

Light-weight tool for removing punctuation. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.

License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

js-mnl-ws-norm

Light-weight tool for normalizing whitespace and accurately tokenizing words. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.

Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

ko-ww-stopwords

Set of whole-word (independent) stop words in Korean

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

mnl-punct-norm

Light-weight tool for removing punctuation. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

mnl-ws-norm

Light-weight tool for normalizing whitespace and accurately tokenizing words (no regex). Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

RairyeTrieSample

Sample implementation of a trie (as an auto-complete dictionary).

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

sentence-tk-checker

Checks output of an English sentence tokenizer and modifies the output according to default or user-defined rules.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

st-no-love

Tool for escaping script tags using backslashes (no regex).

Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

TranslationQATools

Java / Swing / Apache POI translation quality assurance tool.

Language: Java | Stargazers: 0 | Issues: 1 | Issues: 0

TwoLanguageFormOutputFromSingleLanguageInput

(React Native, JavaScript) App for outputting forms in two languages from single-language user input.

Stargazers: 0 | Issues: 1 | Issues: 0