ToastyNews's repositories

electra-hongkongese

Pre-trained ELECTRA from Hong Kong data

Language: Python | License: Apache-2.0 | Stargazers: 26 | Issues: 0

hong-kong-fastText

fastText vectors created from Hong Kong data.

License: CC-BY-4.0 | Stargazers: 20 | Issues: 0

openrice-senti

Reviews scraped from OpenRice for sentiment analysis. Formatted for use with BERT.

License: CC-BY-4.0 | Stargazers: 8 | Issues: 0

cantonese-nlp-benchmark

Benchmark for Cantonese word segmentation and POS tagging.

Language: Python | License: MIT | Stargazers: 6 | Issues: 0

hongkongese-identifier

Simple statistical language detector for Hongkongese, Standard Chinese, and English.

Language: Python | License: MIT | Stargazers: 5 | Issues: 0

lihkg-cat-v2

Forum threads scraped from LIHKG for a categorization task. Formatted for use with BERT.

License: CC-BY-4.0 | Stargazers: 5 | Issues: 0

wordshk-sem

Word-definition pairs scraped from words.hk for a semantic similarity task. Formatted for use with BERT.

License: NOASSERTION | Stargazers: 4 | Issues: 0

ckip-transformers-hk

Hongkongese/Cantonese models compatible with CKIP Transformers.

Language: Python | License: GPL-3.0 | Stargazers: 2 | Issues: 0

finetune-ckip-transformers

Create training files to fine-tune CKIP Transformers

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

pytorch-sentiment-analysis

Hongkongese deep learning dataset and notebooks, forked from the bentrevett/pytorch-sentiment-analysis tutorial.

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0

fastText4j

Facebook's FastText for Java

Language: Java | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

hong-kong-bleu

Data for evaluating translation APIs using Hong Kong text.

Language: Python | License: CC-BY-4.0 | Stargazers: 0 | Issues: 0

lihkg-cat

Forum threads scraped from LIHKG for a categorization task. Formatted for use with BERT.

License: CC-BY-4.0 | Stargazers: 0 | Issues: 0