JinmingChe's repositories

attention_keras

Keras Layer implementation of Attention for Sequential models

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

AttentionIsOFFByOne

Implementation of "Attention Is Off By One" by Evan Miller

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

auto_avsr

Auto-AVSR: Lip-Reading Sentences Project

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

chinese_speech_pretrain

Chinese speech pretrained models

Stargazers: 0 | Issues: 0

CIF-HieraDist

[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation

License: Apache-2.0 | Stargazers: 0 | Issues: 0

ColossalAI

Making big AI models cheaper, easier, and scalable

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Comprehensive-Transformer-TTS

A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

License: MIT | Stargazers: 0 | Issues: 0

DARCN

The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"

Stargazers: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0

FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git

License: NOASSERTION | Stargazers: 0 | Issues: 0

FullSubNet-plus

The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".

License: Apache-2.0 | Stargazers: 0 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit

License: NOASSERTION | Stargazers: 0 | Issues: 0

generative-ai-roadmap

The roadmap of generative AI: use cases and applications

License: CC-BY-4.0 | Stargazers: 0 | Issues: 0

GenericTools

Tools that I use all the time across different projects, collected in one place

Stargazers: 0 | Issues: 0

jieba

Jieba Chinese word segmentation

License: MIT | Stargazers: 0 | Issues: 0

Leveraging-Self-Supervised-Learning-for-AVSR

Official PyTorch implementation of the paper "Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition"

License: MIT | Stargazers: 0 | Issues: 0

LPCNet

Efficient neural speech synthesis

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

MTFAA-Net

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

Stargazers: 0 | Issues: 0

PerceptualAudio

Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM

License: MIT | Stargazers: 0 | Issues: 0

pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

License: MIT | Stargazers: 0 | Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

so-vits-svc-5.0

Core Engine of Singing Voice Conversion & Singing Voice Clone

License: MIT | Stargazers: 0 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0

TIM-Net_SER

[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".

License: GPL-3.0 | Stargazers: 0 | Issues: 0

VITS-fast-fine-tuning

A pipeline for fine-tuning VITS for fast speaker-adaptation TTS and many-to-many voice conversion

License: Apache-2.0 | Stargazers: 0 | Issues: 0

vits_chinese_0829

Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports streaming output

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

License: MIT | Stargazers: 0 | Issues: 0

Whisper-Finetune

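Fine-tune the Whisper speech recognition model, with support for training on data without timestamps, data with timestamps, and data without speech. Accelerated inference, with support for Web deployment, Windows desktop deployment, and Android deployment.

License: Apache-2.0 | Stargazers: 0 | Issues: 0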