JinmingChe's repositories

attention_keras

Keras Layer implementation of Attention for Sequential models

Language: Python, License: MIT

AttentionIsOFFByOne

Implementation of "Attention Is Off By One" by Evan Miller

Language: Python, License: MIT

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python, License: MIT

auto_avsr

Auto-AVSR: Lip-Reading Sentences Project

Language: Python, License: Apache-2.0

chinese_speech_pretrain

Chinese speech pretrained models

CIF-HieraDist

[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation

License: Apache-2.0

ColossalAI

Making big AI models cheaper, easier, and scalable

License: Apache-2.0

Comprehensive-Transformer-TTS

A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

License: MIT

DARCN

The implementation of "A Recursive Network with Dynamic Attention for Monaural Speech Enhancement"

dparn

Language: Python

FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git

License: NOASSERTION

FullSubNet-plus

The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".

License: Apache-2.0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit

License: NOASSERTION

generative-ai-roadmap

The roadmap of generative AI: use cases and applications

License: CC-BY-4.0

GenericTools

Tools that I use across different projects all the time, kept here so they are all centralized

jieba

Jieba Chinese word segmentation

License: MIT

Leveraging-Self-Supervised-Learning-for-AVSR

Official PyTorch implementation of the paper "Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition"

License: MIT

LPCNet

Efficient neural speech synthesis

License: BSD-3-Clause

MTFAA-Net

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

PerceptualAudio

Perceptual metrics of audio: perceptually relevant loss functions (DPAM and CDPAM)

License: MIT

pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

License: MIT

so-vits-svc

SoftVC VITS Singing Voice Conversion

License: BSD-3-Clause

so-vits-svc-5.0

Core Engine of Singing Voice Conversion & Singing Voice Clone

License: MIT

sound-separation

License: Apache-2.0

TIM-Net_SER

[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".

License: GPL-3.0

VITS-fast-fine-tuning

This repo is a VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion

License: Apache-2.0

vits_chinese_0829

Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports streaming output!

Language: Python, License: MIT

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: C++, License: Apache-2.0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

License: MIT

Whisper-Finetune

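Fine-tune the Whisper speech recognition model, with support for training on data without timestamps, data with timestamps, and data without speech. Accelerated inference, with support for Web deployment, Windows desktop deployment, and Android deployment.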

License: Apache-2.0