Miao Zhang (ZenMule)

Company: University of Zurich

Location: Zurich

Home Page: miaozhang.org

Miao Zhang's starred repositories

jieba

Jieba Chinese word segmentation

Language: Python | License: MIT | Stargazers: 33152 | Issues: 1279 | Issues: 851

MS-DOS

The original sources of MS-DOS 1.25, 2.0, and 4.0 for reference purposes

Language: Assembly | License: MIT | Stargazers: 30644 | Issues: 747 | Issues: 0

pkuseg-python

The pkuseg toolkit for multi-domain Chinese word segmentation

Language: Python | License: MIT | Stargazers: 6517 | Issues: 208 | Issues: 165

python-pinyin

Chinese character to pinyin conversion (pypinyin)

Language: Python | License: MIT | Stargazers: 4837 | Issues: 99 | Issues: 263

common-voice

Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

Language: TypeScript | License: MPL-2.0 | Stargazers: 3293 | Issues: 132 | Issues: 2244

ckiptagger

CKIP Neural Chinese Word Segmentation, POS Tagging, and NER

Language: Python | License: GPL-3.0 | Stargazers: 1633 | Issues: 66 | Issues: 40

Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi

Language: Python | License: MIT | Stargazers: 1306 | Issues: 36 | Issues: 707

forced-alignment-tools

A collection of links and notes on forced alignment tools

Language: Python | License: NOASSERTION | Stargazers: 866 | Issues: 38 | Issues: 6

fugashi

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.

Language: C++ | License: MIT | Stargazers: 389 | Issues: 7 | Issues: 73

nagisa

A Japanese tokenizer based on recurrent neural networks

Language: Python | License: MIT | Stargazers: 382 | Issues: 12 | Issues: 30

revealjs

R Markdown Format for reveal.js Presentations

Language: JavaScript | License: NOASSERTION | Stargazers: 325 | Issues: 17 | Issues: 126

g2pW

Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese to Zhuyin (Bopomofo) or pinyin (INTERSPEECH 2022)

Language: Python | License: Apache-2.0 | Stargazers: 277 | Issues: 5 | Issues: 17

g2pK

g2pK: g2p module for Korean

Language: Python | License: Apache-2.0 | Stargazers: 233 | Issues: 5 | Issues: 7

konoha

🌿 An easy-to-use Japanese text processing tool that lets you switch tokenizers with only small code changes.

Language: Python | License: MIT | Stargazers: 226 | Issues: 7 | Issues: 40

KoG2P

Korean grapheme-to-phone conversion in Python

Language: Python | License: GPL-3.0 | Stargazers: 127 | Issues: 5 | Issues: 2

ccma

Curvature Corrected Moving Average: An accurate and model-free path smoothing algorithm.

Language: Python | License: BSD-3-Clause | Stargazers: 107 | Issues: 2 | Issues: 1

jieba-tw

Jieba Chinese word segmentation, Taiwan Traditional Chinese version

Language: Python | License: MIT | Stargazers: 99 | Issues: 15 | Issues: 2

PrettyPDF

Quarto extension to generate a PDF with (pretty) LaTeX styling.

Language: TeX | License: CC0-1.0 | Stargazers: 91 | Issues: 7 | Issues: 3

ipapy

ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings

Language: Python | License: MIT | Stargazers: 81 | Issues: 2 | Issues: 5

itsp

Introduction to Speech Processing

Language: Jupyter Notebook | License: CC-BY-SA-4.0 | Stargazers: 62 | Issues: 4 | Issues: 7

dragonmapper

Identification and conversion functions for Chinese text processing

Language: Python | License: MIT | Stargazers: 56 | Issues: 7 | Issues: 26

commonvoice-utils

Linguistic processing for Common Voice

Language: Python | License: AGPL-3.0 | Stargazers: 51 | Issues: 5 | Issues: 25

pinyin-to-ipa

Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.

Language: Python | License: MIT | Stargazers: 30 | Issues: 3 | Issues: 3

trajr

Trajectory Analysis in R

Language: R | License: NOASSERTION | Stargazers: 26 | Issues: 7 | Issues: 5

taibun

Taiwanese Hokkien Transliterator and Tokeniser

Language: Python | License: MIT | Stargazers: 23 | Issues: 2 | Issues: 10

hangul_to_ipa

A dash app that transcribes 한글 into [hɑŋɡɯl].

Language: Python | License: MIT | Stargazers: 21 | Issues: 2 | Issues: 11

PersianG2P

Persian Grapheme-to-Phoneme (G2P) converter

Language: Python | License: MIT | Stargazers: 19 | Issues: 1 | Issues: 0

sch-corpus

A Hmong language corpus derived from the soc.culture.hmong Usenet group

License: CC0-1.0 | Stargazers: 5 | Issues: 2 | Issues: 0

yitizi-rs

Get all variants (yitizi, 異體字) of a Chinese character (Sinograph)!

Language: Rust | Stargazers: 3 | Issues: 1 | Issues: 0