Ewald Enzinger (entn-at)

Location: Portland, Oregon

Home Page: https://entn.at/

Twitter: @entn_at

Ewald Enzinger's repositories

Language: Python | Stargazers: 3 | Issues: 1 | Issues: 0

DDDM-VC

Official PyTorch implementation of "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Stargazers: 1 | Issues: 0 | Issues: 0

OpenVoice

Instant voice cloning

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

StreamVC

An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".

License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

agc

Audiogen Codec

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

DCA-PLDA

Discriminative Condition-Aware PLDA

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

FreeV

[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

gazelle-train

Joint speech-language model - respond directly to audio!

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

istft-onnx

Export an ONNX graph that performs ISTFT. Designed for TTS models.

Stargazers: 0 | Issues: 0 | Issues: 0

languagecodec

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

pyannote-audio_overlapped-speech-detection_cpp

C++ version of the pyannote audio overlapped speech detection pipeline

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Stargazers: 0 | Issues: 0 | Issues: 0

rustfst

Rust library for Weighted Finite State Transducers as described by Mohri and Allauzen

Language: Rust | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

Toroidal-PSDA

A probabilistic scoring backend for length-normalized embeddings.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Train_Hifigan_XTTS

An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS.

Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Triton-Puzzles

Puzzles for learning Triton

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

tts-scores

Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

voxangeles

VoxAngeles Corpus

Language: Praat | Stargazers: 0 | Issues: 0 | Issues: 0

WhisperKit

Swift native on-device speech recognition with Whisper for Apple Silicon

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

whisperkittools

Python tools for WhisperKit: Model conversion, optimization and evaluation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0