IMYBo

NWPU

China

Cyril Lv's starred repositories

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

Language: Python | License: NOASSERTION | Stars: 88

brouhaha-vad

Predicts the level of noise and reverberation in your audio files

Language: Jupyter Notebook | License: MIT | Stars: 122

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

License: MIT | Stars: 235

S4M

Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models

Language: Python | License: MIT | Stars: 16

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: Apache-2.0 | Stars: 10855

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stars: 20047

NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | License: Apache-2.0 | Stars: 10824

dover-lap

Python package for combining diarization system outputs.

Language: Python | License: MIT | Stars: 74

VBx

Variational Bayes HMM over x-vectors diarization

Language: Python | Stars: 242

s4

Structured state space sequence models

Language: Jupyter Notebook | License: Apache-2.0 | Stars: 2235

S4ND-U-Net_speech_enhancement

Language: Python | License: Apache-2.0 | Stars: 27

aero

This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)

Language: Python | License: MIT | Stars: 188

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stars: 1500

VB_diarization

VB Diarization with Eigenvoice and HMM Priors, refactored

Language: Python | Stars: 15

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stars: 1012

NSD-MS2S

CHiME-7 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence architecture

Language: Shell | Stars: 55

hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Language: Python | License: Apache-2.0 | Stars: 197

MS-SNSD

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired.

Language: HTML | License: MIT | Stars: 455

BAE-Net

BAE-Net: A Low Complexity and High Fidelity Bandwidth-Adaptive Neural Network for Speech Super-Resolution

Stars: 44

NeXt_TDNN_ASV

Official repository of NeXt-TDNN for speaker verification

Language: Python | Stars: 40

kmeans_pytorch

kmeans using PyTorch

Language: Jupyter Notebook | License: MIT | Stars: 449

WeChatMsg

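Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the chat history to generate an annual chat report; and uses the chat data to train a personal AI chat assistant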

Language: Python | License: GPL-3.0 | Stars: 31031

nider

Python package to add text to images, textures and different backgrounds

Language: Python | License: MIT | Stars: 149

Frame-by-frame-closed-form-update-for-mask-based-adaptive-MVDR-beamforming

speech-enhancement

Language: Python | Stars: 44

unfoldNd

(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch

Language: Python | License: MIT | Stars: 80

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Python | License: Apache-2.0 | Stars: 7427

EEND_dataprep

Language: Shell | Stars: 43

BlueLM

BlueLM(蓝心大模型): Open large language models developed by vivo AI Lab

Language: Python | License: NOASSERTION | Stars: 810

UniAudio

The official source code of UniAudio

Language: Python | Stars: 76