edwincheong's starred repositories
GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
fish-speech
Brand new TTS solution
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
gemma_pytorch
The official PyTorch implementation of Google's Gemma models
WhisperSpeech
An open-source text-to-speech system built by inverting Whisper.
WhisperKit
On-device Speech Recognition for Apple Silicon
HierSpeechpp
The official implementation of HierSpeech++
Conv-TasNet
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation (PyTorch implementation)
Awesome-Document-Image-Rectification
A comprehensive list of awesome document image rectification papers.
VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
OpenPhonemizer
An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.
appjsonify
A handy PDF-to-JSON conversion tool for academic papers implemented in Python.