zyser's repositories

MB-iSTFT-VITS2

Application of MB-iSTFT-VITS components to vits2_pytorch

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

RAVE

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 1 | Issues: 0

whisperX

WhisperX: Timestamp-Accurate Automatic Speech Recognition.

Language: Python | License: BSD-2-Clause | Stargazers: 1 | Issues: 0 | Issues: 0

Codec-SUPERB

Audio Codec Speech processing Universal PERformance Benchmark

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

conditional-flow-matching

TorchCFM: a Conditional Flow Matching library

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

DDSP-SVC

End-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Diff-HierVC

Official PyTorch implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

FAcodec

Training code for FAcodec presented in NaturalSpeech3

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Grad-TTS

Implementation of the 'Grad-TTS' with Multilingual Cleaners

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

LLaMA-Factory

Unify Efficient Fine-tuning of 100+ LLMs

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

llama.cpp

Port of Facebook's LLaMA model in C/C++

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

llama2.c

Inference Llama 2 in one file of pure C

Language: C | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

megatts2

Unofficial implementation of Megatts2

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

NeMo-text-processing

NeMo text processing for ASR and TTS

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OpenPhonemizer

Permissively licensed, open sourced, local IPA Phonemizer (G2P) powered by deep learning.

Language: Python | License: BSD-3-Clause-Clear | Stargazers: 0 | Issues: 0 | Issues: 0

Phi-3CookBook

This is a book for getting started with Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

PitchSqueezer

A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

stable-speech

Reproduction of Stability AI's Text-to-Speech model.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

supervoice-gpt

GPT-style network for phonemization with durations of text

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

tts-arabic-pytorch

TTS models for Arabic (Tacotron2, FastPitch)

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

utmos

A toolkit to calculate speech audio quality. Not affiliated with the original authors

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

vitsgpt-vits

The code for VITS in the vitsGPT project

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

VoRAS_VC

VoRAS: Vocos Retrieval and self-Augmentation for Speech

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

WavCraft

Official repo for WavCraft, an AI agent for audio creation and editing

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0