Sheng Li's repositories
chinese-poetry
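The most comprehensive database of classical Chinese poetry: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci by 1,564 ci poets of the Song period.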
cino
CINO: Pre-trained Language Models for Chinese Minority Languages
Classical-Modern
A very comprehensive parallel corpus of Classical Chinese (ancient prose) and modern Chinese
CNTK
Computational Network Toolkit (CNTK)
e2e_lfmmi
This is the implementation of the paper "Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI", submitted to ICASSP 2022
EasyEspnet
Making ESPnet easier to use
eesen
The official repository of the Eesen project
Forced-Alignment
GSoC'16 RedHen Labs
julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
kaldi
This is now the official location of the Kaldi project.
khmer-language-model-ulmfit
Khmer Language Model using ULMFiT
KhmerWordSegmentation
Separate Khmer words from given sentences.
Thai-NLP-Dataset
More than 43 collections of Thai Natural Language Processing libraries. Updated daily.
tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
voxlingua107_sb
VoxLingua107 recipe for SpeechBrain
warp-ctc
Fast parallel CTC.