Jean Du (duj12)

Company: Xmov.ai

Location: Shanghai

Jean Du's repositories

ASR-2Pass

ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).

cnn-lstm-based-malware-document-classification

Uses CNN/LSTM and ensemble models to classify documents according to the API call sequence each document makes.

Language: Python | License: MIT | Stargazers: 12 | Issues: 1 | Issues: 0

ss-vad

Self-supervised VAD (voice activity detection)

Language: Python | License: MIT | Stargazers: 7 | Issues: 2 | Issues: 0

wekws

Production First and Production Ready End-to-End Keyword Spotting Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 5 | Issues: 0 | Issues: 0

kws_demo

KWS demo based on CTC prefix beam search.

OpenVoice

Instant voice cloning by MyShell

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

ali-kaldi

The Ali open-source Kaldi for DFSMN

Language: Shell | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Coqui-TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

duj12

Config files for my GitHub profile.

Stargazers: 0 | Issues: 2 | Issues: 0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

radtts

Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low Dimensional (F0 and Energy) Speech Attributes.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

riva-asrlib-decoder

Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

WenetSpeech

A 10,000+ hour dataset for Chinese speech recognition

Language: Shell | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

WeTextProcessing

Text Normalization & Inverse Text Normalization

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

Language: C | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0