danfu's starred repositories

ddia

Chinese translation of "Designing Data-Intensive Applications" (DDIA)

Language: Python | License: CC-BY-4.0 | Stargazers: 19539 | Issues: 363 | Issues: 71

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 13911 | Issues: 698 | Issues: 1636

fancyss_history_package

Offline installation packages for the proxy ("scientific internet access") plugin are stored here

License: GPL-3.0 | Stargazers: 10237 | Issues: 435 | Issues: 0

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ | License: Apache-2.0 | Stargazers: 9800 | Issues: 122 | Issues: 733

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | Stargazers: 8965 | Issues: 86 | Issues: 359

vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7361 | Issues: 119 | Issues: 1464

ECDICT

Free English to Chinese Dictionary Database

Language: Python | License: MIT | Stargazers: 5723 | Issues: 124 | Issues: 107

pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.

Language: Python | License: Apache-2.0 | Stargazers: 5328 | Issues: 85 | Issues: 456

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python | License: NOASSERTION | Stargazers: 3955 | Issues: 48 | Issues: 841

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 3874 | Issues: 90 | Issues: 987

space-vim

🍀 Lean & mean spacemacs-ish Vim distribution

Language: Vim Script | License: MIT | Stargazers: 2852 | Issues: 64 | Issues: 324

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python | License: Apache-2.0 | Stargazers: 2370 | Issues: 21 | Issues: 359

cnnimageretrieval-pytorch

CNN Image Retrieval in PyTorch: Training and evaluating CNNs for Image Retrieval in PyTorch

Language: Python | License: MIT | Stargazers: 1409 | Issues: 33 | Issues: 76

Deep_Metric

Deep Metric Learning

Language: Python | License: Apache-2.0 | Stargazers: 774 | Issues: 35 | Issues: 49

Speech-enhancement

Deep learning for audio denoising

Language: Python | License: MIT | Stargazers: 607 | Issues: 19 | Issues: 20

chinese_text_normalization

Chinese text normalization for speech processing

Language: Python | License: MIT | Stargazers: 596 | Issues: 15 | Issues: 11

word-discovery

Faster, more effective new-word discovery for Chinese

research-ms-loss

MS-Loss: Multi-Similarity Loss for Deep Metric Learning

Language: Python | License: NOASSERTION | Stargazers: 486 | Issues: 22 | Issues: 23

g2pm

A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset

Language: Python | License: Apache-2.0 | Stargazers: 333 | Issues: 15 | Issues: 18

multigrain

Code for "MultiGrain: a unified image embedding for classes and instances"

Language: Python | License: NOASSERTION | Stargazers: 229 | Issues: 16 | Issues: 11

pychain

PyTorch implementation of LF-MMI for End-to-end ASR

Language: C++ | Stargazers: 216 | Issues: 28 | Issues: 0

cn-text-normalizer

A Python module that converts written-form Chinese strings to spoken-form strings.

Language: Python | License: MIT | Stargazers: 120 | Issues: 7 | Issues: 2

KeSpeech

The repo provides information about the KeSpeech dataset.

pkwrap

A PyTorch wrapper for LF-MMI training and parallel training in Kaldi

Language: Python | License: NOASSERTION | Stargazers: 72 | Issues: 12 | Issues: 22

speech-to-text

Mixed-lingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras

Language: Jupyter Notebook | Stargazers: 70 | Issues: 8 | Issues: 14

multi-task-kaldi

An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 training.

Language: Shell | License: Apache-2.0 | Stargazers: 54 | Issues: 5 | Issues: 5

goparrot

Goodness of Pronunciation (GOP) for oral reading assessment.

multistream-cnn

Multistream CNN for Robust Acoustic Modeling

Language: Shell | License: NOASSERTION | Stargazers: 39 | Issues: 4 | Issues: 1

Text_Normalization

A text normalization framework using GBM and human-generated features

Language: Python | Stargazers: 9 | Issues: 0 | Issues: 0