Ewald Enzinger (entn-at)

Location: Portland, Oregon

Home Page: https://entn.at/

Twitter: @entn_at

Ewald Enzinger's repositories

Language: Python | Stargazers: 3 | Issues: 0 | Issues: 0

ATST-SED

This repo contains the official implementation of "Fine-tune the pretrained ATST model for sound event detection".

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

OpenVoice

Instant voice cloning

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

pflowtts_pytorch

Unofficial implementation of NVIDIA's P-Flow TTS paper

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

TransformersSpeechAligner

Long speech to text alignment based on Huggingface Transformers.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

agc

Audiogen Codec

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

audiossl

A library for easier audio self-supervised training and downstream task evaluation

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

control-vc

This is the implementation of "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm".

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

DCA-PLDA

Discriminative Condition-Aware PLDA

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

flutter_onnx

ONNX runtime plugin for Flutter

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

flutter_sherpa_onnx

Flutter plugin wrapping the Sherpa-ONNX runtime

Language: Dart | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

last

A JAX library for building lattice-based speech transducer models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OverFlow

Probabilistic speech synthesis by mixing neural HMM TTS with normalising flows

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

pyannote-audio_overlapped-speech-detection_cpp

C++ version of the pyannote.audio overlapped speech detection pipeline

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

stable-ts

Timestamping Spoken Words

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Triton-Puzzles

Puzzles for learning Triton

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

tts-scores

Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

utut

Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation

Stargazers: 0 | Issues: 0 | Issues: 0

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

valle

Zero-Shot Text-To-Speech

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

WhisperKit

Swift-native on-device speech recognition with Whisper for Apple Silicon

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

whisperkittools

Python tools for WhisperKit: model conversion, optimization, and evaluation

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0