splinter21's repositories
BcutASR
Reverse-engineered speech recognition API for 必剪 (Bcut)
codec-bpe
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
contentvec
speech self-supervised representations
ControlNetPlus
ControlNet++: All-in-one ControlNet for image generation and editing!
e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
FasterLivePortrait
Bring portraits to life in real time! ONNX/TensorRT support!
fftw3
DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
inferStreamHiFiGAN
StreamHiFiGAN offers a HiFiGAN vocoder model optimized for streaming inference, providing real-time audio synthesis capabilities.
Kolors
Kolors Team
models
All my self-trained and released AI upscaling models. After gathering and applying over 600 different upscaling models, I learned how to train my own models, and these are the results.
Music-Source-Separation-Training
Repository for training models for music source separation.
noise-reduction
noise reduction
Paints-UNDO
Understand Human Behavior to Align True Needs
phrex
Phrex is a PyTorch model for inferring speaker-independent embeddings and pitch from speech audio spectrograms
promonet
Prosody and Pronunciation Modification Network
SenseVoice-python
SenseVoice with ONNX Runtime
SOFA
SOFA: Singing-Oriented Forced Aligner
SpeechDenoiser
SpeechDenoiser: Real-Time Speech Denoising with ONNX. Welcome to SpeechDenoiser, a simple and effective solution for real-time speech denoising using an ONNX model. This repository contains everything you need to get started with enhancing audio quality by reducing noise, making it perfect for improving voice recordings and live communication.
split-lang
✨ Split text by language (i18n) powered by wtpsplit and langdetect (fasttext) [e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗]
vampnet
music generation with masked transformers!
vs_deepdeinterlace
AI deinterlacing functions for VapourSynth
wetext
Python runtime for WeTextProcessing (does not depend on Pynini)
YOLO-Stutter
YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection