Beast code in Giters

yxfy's repositories

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

License: NOASSERTION

CosyVoice

A multilingual large voice generation model with full-stack support for inference, training, and deployment.

License: Apache-2.0

MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

License: AGPL-3.0

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

License: NOASSERTION

VoiceprintRecognition-Pytorch

This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. The project also supports the MelSpectrogram and Spectrogram data preprocessing methods.

License: Apache-2.0

bark

🔊 Text-Prompted Generative Audio Model

License: MIT

LLaSM

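The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.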

License: Apache-2.0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

License: Apache-2.0

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

License: Apache-2.0

speech

AuxFormer

AuxFormer: Robust Approach to Audiovisual Emotion Recognition

License: MIT

SoundLabel

A labeling tool for creating speech datasets

PaddleSpeech

Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System and End-to-End Speech Simultaneous Translation.

License: Apache-2.0

espnet

End-to-End Speech Processing Toolkit

License: Apache-2.0

AdaSpeech

An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"

MOSNettf

Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"

License: NOASSERTION

MOSNet-pytorch

A PyTorch implementation of MOSNet

License: NOASSERTION

Robust_Fine_Grained_Prosody_Control

PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis

License: BSD-3-Clause

Multimodal-Emotion-Recognition

This repository contains the code for the paper `End-to-End Multimodal Emotion Recognition using Deep Neural Networks`.

License: BSD-3-Clause

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

License: MIT

Information-Extraction-Chinese

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT

FastSpeech

An implementation of FastSpeech based on PyTorch.

TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supported languages include English, Korean, and Chinese)

License: Apache-2.0

yxfy

yxfy's repositories

TeleSpeech-ASR

SenseVoice

FunASR

CosyVoice

MARS5-TTS

ChatTTS

seed-tts-eval

NeuralSpeech

VoiceprintRecognition-Pytorch

bark

LLaSM

wenet

ChatGLM-6B

speech

AuxFormer

SoundLabel

PaddleSpeech

espnet

AdaSpeech

MOSNettf

MOSNet-pytorch

Robust_Fine_Grained_Prosody_Control

Multimodal-Emotion-Recognition

FastSpeech2

Information-Extraction-Chinese

FastSpeech

TensorFlowTTS