shaun95

zyser's repositories

spear-tts-pytorch

An unofficial PyTorch implementation of SPEAR-TTS.

Language: Jupyter Notebook | License: MIT | 1 0 0

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and WAV file maintenance. It can also be used with third-party software via JSON calls.

Language: HTML | License: AGPL-3.0 | 0 0 0

Bert-VITS2

vits2 backbone with bert

Language: Python | License: AGPL-3.0 | 0 0 0

agent-attention-pytorch

Implementation of Agent Attention in Pytorch

License: MIT | 0 0 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 0 0 0

AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Language: Python | License: NOASSERTION | 0 0 0

BigVGAN-NVIDIA

Official implementation of BigVGAN in PyTorch

Language: Python | License: MIT | 0 0 0

diffsptk

A differentiable version of SPTK

Language: Python | License: Apache-2.0 | 0 0 0

e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Language: Python | License: MIT | 0 0 0

fairseq_meta_fork

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | 0 1 0

flash-attention

Language: Python | License: BSD-3-Clause | 0 0 0

gmm-torch

Gaussian mixture models in PyTorch.

Language: Python | License: MIT | 0 0 0

golf

A DDSP-based neural vocoder.

Language: Jupyter Notebook | License: MIT | 0 0 0

HierSpeechpp_zero_shot_vc

The official implementation of HierSpeech++

Language: Python | License: MIT | 0 0 0

local-attention

An implementation of local windowed attention for language modeling

Language: Python | License: MIT | 0 0 0

LPCNet

Efficient neural speech synthesis

Language: C | License: BSD-3-Clause | 0 3 0

mamba

Language: Python | License: Apache-2.0 | 0 0 0

metavoice-src

AI for human-level speech intelligence

Language: Python | License: Apache-2.0 | 0 0 0

phonemizer

Simple text to phones converter for multiple languages

Language: Python | License: GPL-3.0 | 0 0 0

ring-attention-pytorch

Explorations into Ring Attention, from Liu et al. at Berkeley AI

Language: Python | License: MIT | 0 0 0

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | 0 0 0

soundata

Python library for downloading, loading & working with sound datasets

Language: Python | License: BSD-3-Clause | 0 0 0

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Language: Python | License: Apache-2.0 | 0 0 0