Cyril Lv's starred repositories
BS-RoFormer
Implementation of Band Split Roformer, a state-of-the-art attention network for music source separation from ByteDance AI Labs
versatile_audio_super_resolution
Versatile audio super-resolution (any sampling rate -> 48 kHz) with AudioSR.
INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
flash-attention
Fast and memory-efficient exact attention
chatgpt-on-wechat
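A chatbot built on large models, with support for integration through WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT4.0/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It can process text, voice, and images, access the operating system and the internet, and supports building a customized enterprise customer-service assistant on top of your own knowledge base.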
torchcrepe
PyTorch implementation of the CREPE pitch tracker
AIGC-progress
Follow the rapid development of AIGC models and applications. 🚀
Speaker-Diarization
Speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
deep-speaker
Deep Speaker: an End-to-End Neural Speaker Embedding System.
Audio-Effects
Collection of audio effects plugins implemented from the explanations in the book "Audio Effects: Theory, Implementation and Application" by Joshua D. Reiss and Andrew P. McPherson.
3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
speechbrain
A PyTorch-based Speech Toolkit
vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
IntelNeuromorphicDNSChallenge
Intel Neuromorphic DNS Challenge
Large-Audio-Models
Keep track of large models in the audio domain, including speech, singing, music, etc.