l2009312042's starred repositories

peoples-speech

The People’s Speech Dataset

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 94 | Issues: 0

room-impulse-responses

A list of publicly available room impulse response datasets and scripts to download them.

Language: Shell | Stargazers: 347 | Issues: 0

cv-arxiv-daily

🎓 Automatically update CV papers daily using GitHub Actions (updates every 12 hours)

Language: Python | License: Apache-2.0 | Stargazers: 771 | Issues: 0

charsiu

Charsiu: A neural phonetic aligner.

Language: Jupyter Notebook | License: MIT | Stargazers: 256 | Issues: 0

LibriPhrase

Recipe for LibriPhrase

Language: Python | License: MIT | Stargazers: 22 | Issues: 0
Language: Python | Stargazers: 3 | Issues: 0

rubberband

Official mirror of Rubber Band Library, an audio time-stretching and pitch-shifting library.

Language: C++ | License: GPL-2.0 | Stargazers: 533 | Issues: 0

Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.

Stargazers: 438 | Issues: 0

Inter-SubNet

The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.

Language: Python | License: Apache-2.0 | Stargazers: 85 | Issues: 0

spiking-fullsubnet

Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.

Language: Python | License: MIT | Stargazers: 43 | Issues: 0

small-footprint-keyword-spotting

Effective processing pipeline and advanced neural network architectures for small-footprint keyword spotting

Language: Python | Stargazers: 6 | Issues: 0

sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | License: MIT | Stargazers: 407 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4057 | Issues: 0

RobustConformer

Robust speech recognition using teacher-student learning

Language: Python | Stargazers: 2 | Issues: 0

DPSL-ASR

Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"

Language: Python | License: Apache-2.0 | Stargazers: 34 | Issues: 0

sentencepiece_chinese_bpe

Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.

Language: Python | Stargazers: 93 | Issues: 0

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | License: NOASSERTION | Stargazers: 9841 | Issues: 0

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | License: MIT | Stargazers: 1065 | Issues: 0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 6531 | Issues: 0

INTERSPEECH-2023-Papers

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

License: MIT | Stargazers: 595 | Issues: 0

NKF-AEC

Acoustic Echo Cancellation with Neural Kalman Filtering

Language: HTML | Stargazers: 191 | Issues: 0

awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers.

License: MIT | Stargazers: 1062 | Issues: 0

algorithm-journey

Code and courseware for the algorithm mastery course (算法通关课).

Language: Java | Stargazers: 879 | Issues: 0

leetcode-master

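代码随想录 LeetCode practice guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀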

Language: Shell | Stargazers: 48169 | Issues: 0

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Language: Python | License: MIT | Stargazers: 7334 | Issues: 0

ego2022

JOINT EGO-NOISE SUPPRESSION AND KEYWORD SPOTTING ON SWEEPING ROBOTS

Language: MATLAB | Stargazers: 24 | Issues: 0

KAN-TTS

KAN-TTS is a speech-synthesis training framework. Please try the demos posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Language: Python | License: MIT | Stargazers: 449 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-4-Clause | Stargazers: 9591 | Issues: 0

WavAugment

A library for speech data augmentation in the time domain

Language: Python | License: MIT | Stargazers: 630 | Issues: 0

porcupine

On-device wake word detection powered by deep learning

Language: Python | License: Apache-2.0 | Stargazers: 3546 | Issues: 0