Beast code in Giters

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

License: MIT | 617 | 0 | 0

MP-SENet

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra

Language: Python | License: MIT | 263 | 0 | 0

whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook | License: BSD-2-Clause | 2474 | 0 | 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | 12472 | 0 | 0

chatgpt-on-wechat

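A chatbot built on large language models that supports integration with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It can use GPT3.5/GPT-4o/GPT4.0/Claude/ERNIE Bot (Wenxin Yiyan)/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building customized enterprise customer-service bots on top of a private knowledge base.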

Language: Python | License: MIT | 28438 | 0 | 0

torchcrepe

PyTorch implementation of the CREPE pitch tracker

Language: Python | License: MIT | 390 | 0 | 0

AIGC-progress

Follow the rapid development of AIGC models and applications. 🚀

81 | 0 | 0

MiniVox

Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".

Language: Cuda | 25 | 0 | 0

online_speaker_diarization

Language: Perl | 13 | 0 | 0

wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Language: Python | License: Apache-2.0 | 601 | 0 | 0

Speaker-Diarization

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition

Language: Python | License: Apache-2.0 | 458 | 0 | 0

wesignal

Production first, nn-based on-device signal processing toolkit.

License: Apache-2.0 | 63 | 0 | 0

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | 536 | 0 | 0

deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.

Language: Python | License: MIT | 897 | 0 | 0

RBN

The official repo of the CVPR2021 oral paper: Representative Batch Normalization with Feature Calibration

Language: Python | 85 | 0 | 0

Audio-Effects

Collection of audio effects plugins implemented from the explanations in the book "Audio Effects: Theory, Implementation and Application" by Joshua D. Reiss and Andrew P. McPherson.

Language: C++ | 698 | 0 | 0

AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python | License: NOASSERTION | 9907 | 0 | 0

3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Language: Python | License: Apache-2.0 | 948 | 0 | 0

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | 8280 | 0 | 0

SDCM

Language: Python | 23 | 0 | 0

vector-quantize-pytorch

Vector (and Scalar) Quantization, in PyTorch

Language: Python | License: MIT | 2198 | 0 | 0

IntelNeuromorphicDNSChallenge

Intel Neuromorphic DNS Challenge

Language: Jupyter Notebook | License: MIT | 120 | 0 | 0

Large-Audio-Models

Keep track of big models in audio domain, including speech, singing, music etc.

416 | 0 | 0

IMYBo

Cyril Lv's starred repositories

UniAudio

coder2gwy

BS-RoFormer

versatile_audio_super_resolution

SpatialCodec

cocopilot

padasip

INTERSPEECH-2023-Papers