Tool to collect and review sentences for Common Voice
Do Thai first-time voters on Twitter tend to be in their own echo chambers?
Thai Natural Language Processing in Python.
Faker is a Python package that generates fake data for you.
Official Stanford NLP Python Library for Many Human Languages
Scraping Wikipedia for fair use sentences
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
Standalone Dictionary-based, Maximum Matching + Thai Character Cluster (newmm) tokenizer extracted from PyThaiNLP
A step-by-step tutorial for publishing data and an ontology as Linked Data on your machine.
More than 50 collections of Thai Natural Language Processing libraries. Updated daily.
Thai Common Voice sentences
Command line tool to create corpora for Common Voice
My GitHub Profile
A curated list of references for MLOps
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Currently the largest dataset for Thai text summarization, with over 310K articles.
Automatic extraction of edited sentences from text edition histories.
A synthetic data generator for text recognition
Because DimSum is taken by Aone et al. (1997) already.
A fast and accurate POS and morphological tagging toolkit (EACL 2014)
Test some ideas on NLP, mostly for the Thai language
Play with data from social media