redust (andyye1999)

andyye1999

Geek Repo

Company:Dalian University of Technology

Home Page:https://andyye1999.github.io

Github PK Tool:Github PK Tool

redust's starred repositories

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language:PythonLicense:NOASSERTIONStargazers:5509Issues:0Issues:0

ChatTTS

A generative speech model for daily dialogue.

Language:PythonLicense:AGPL-3.0Stargazers:29306Issues:0Issues:0

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

Stargazers:502Issues:0Issues:0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language:PythonLicense:Apache-2.0Stargazers:7067Issues:0Issues:0
Language:PythonLicense:MITStargazers:30Issues:0Issues:0

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Language:PythonStargazers:869Issues:0Issues:0

SemantiCodec-inference

Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.

Language:PythonLicense:MITStargazers:103Issues:0Issues:0

SenseVoice

Multilingual Voice Understanding Model

Language:PythonLicense:NOASSERTIONStargazers:2114Issues:0Issues:0

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language:PythonLicense:Apache-2.0Stargazers:3917Issues:0Issues:0

encodecmae

Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'

Language:PythonStargazers:79Issues:0Issues:0

gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

Language:PythonLicense:MITStargazers:160Issues:0Issues:0

TRT-SE

An example of a speech enhancement model deployed with TensorRT.

Language:PythonStargazers:27Issues:0Issues:0

BAE-Net

BAE-NET: A LOW COMPLEXITY AND HIGH FIDELITY BANDWIDTH-ADAPTIVE NEURAL NETWORK FOR SPEECH SUPER-RESOLUTION

Language:PythonStargazers:51Issues:0Issues:0

speechmetrics

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Language:PythonLicense:MITStargazers:886Issues:0Issues:0

emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Language:PythonStargazers:551Issues:0Issues:0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

Stargazers:1644Issues:0Issues:0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language:PythonLicense:NOASSERTIONStargazers:1348Issues:0Issues:0

dual-path-RNNs-DPRNNs-based-speech-separation

A PyTorch implementation of dual-path RNNs (DPRNNs) based speech separation described in "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation".

Language:PythonStargazers:165Issues:0Issues:0

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language:PythonLicense:Apache-2.0Stargazers:13019Issues:0Issues:0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:10664Issues:0Issues:0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language:PythonLicense:MITStargazers:4378Issues:0Issues:0

UltraDualPathCompression

A Pytorch-based implementation of the compression and decompression module in "Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression".

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:35Issues:0Issues:0

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language:PythonLicense:Apache-2.0Stargazers:949Issues:0Issues:0

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation et.al.

Language:PythonLicense:MITStargazers:334Issues:0Issues:0

AudioCodec-Hub

AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models

Language:PythonLicense:MITStargazers:20Issues:0Issues:0

awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

License:MITStargazers:2931Issues:0Issues:0

coder2gwy

互联网首份程序员考公指南,由3位已经进入体制内的前大厂程序员联合献上。

Stargazers:26146Issues:0Issues:0

SFANC-Window

Real-time Implementation of CNN-based selective fixed-filter active noise control and effectiveness analysis using explainable AI

Language:PythonStargazers:15Issues:0Issues:0

awesome-python-scientific-audio

Curated list of python software and packages related to scientific research in audio

Stargazers:1539Issues:0Issues:0