Ewald Enzinger's repositories
alpaca-lora
Code for reproducing the Stanford Alpaca InstructLLaMA result on consumer hardware
faster-whisper
Faster Whisper transcription with CTranslate2
openai-whisper
Robust Speech Recognition via Large-Scale Weak Supervision
BetaVAE_VC
Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE"
CAT
A CRF-based ASR Toolkit
D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
fstalign
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
LLaMA-Adapter
LLaMA-Adapter: Tuning LLaMA within One Hour and 8M Parameters
llama-classification
Text classification with Foundation Language Model LLaMA
meilisearch-fastapi
Meilisearch integration with FastAPI
psr-calibration
Code for reproducing experiments of the paper.
pyctcdecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
samo
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
sequence_align
Efficient implementations of Needleman-Wunsch and other sequence alignment algorithms written in Rust with Python bindings via PyO3.
SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
TriAAN-VC
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
vector-quantize-pytorch
Vector Quantization, in PyTorch
vosk-tts
Text To Speech Synthesis with Vosk
whisper-punctuator
Zero-shot Punctuation Insertion using Whisper
whisperX
WhisperX: Timestamp-Accurate Automatic Speech Recognition.