Aby Louw's repositories
APNet2
Source code of APNet2, a vocoder
audioseal
Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector
audiowmark
Audio Watermarking
ConsistencyVC-voive-conversion
Jointly training a speaker encoder with a consistency loss to achieve cross-lingual and expressive voice conversion
DeepPhonemizer
Grapheme to phoneme conversion with deep learning.
dhasa2023_styleguide
Style guide for the Digital Humanities Association of Southern Africa (DHASA) fourth conference, 2023.
dectalk
Modern builds for the 90s/00s DECtalk text-to-speech application.
descript-audio-vae
VAE-GAN modified from Descript Audio Codec, replacing the RVQ with a VAE
DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
flet
Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.
hubert
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
istftnet
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
licensecc
Software licensing, copy protection in C++. It has few dependencies and it's cross-platform.
Matcha-TTS
🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
MB-iSTFT-VITS2
Application of MB-iSTFT-VITS components to vits2_pytorch
MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
onnx-simplifier
Simplify your onnx model
pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
phonepiece
Phone inventory library
pytorch-fid
Compute FID scores with PyTorch.
QuickVC-VoiceConversion
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
Real3DPortrait
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code
Sunsynk-Home-Assistant-Power-Flow-Card
A simple card to emulate the Sunsynk power flow that is shown on the inverter
UniCATS-CTX-vec2wav
Code for CTX-vec2wav in UniCATS
vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
VoiceFlow-TTS
This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
wavmark
AI-based Audio Watermarking Tool
yaml-ui-editor
YAML UI editor application with Git repository storage
ZEST
Zero-Shot Emotion Style Transfer