shaun's repositories

audio-diffusion

Apply Denoising Diffusion Probabilistic Models using the new Hugging Face diffusers package to synthesize music instead of images.

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 1 | Issues: 0 | Issues: 0

OpenVoice

Instant voice cloning by MyShell

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

VITS_gpt_llama

the code for llama in the vitsGPT project

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

voicefixer_main

General Speech Restoration

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0

Bert-VITS2

vits2 backbone with bert

Language: Python | License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

diffsptk

A differentiable version of SPTK

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

FAcodec

Training code for FAcodec presented in NaturalSpeech3

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

fregrad

Code repository for FreGrad

Stargazers: 0 | Issues: 0 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

larynx2_vits_TTS_cpp_onnx

A fast, local neural text to speech system

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

llm.c

LLM training in simple, raw C/CUDA

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

metavoice-src

AI for human-level speech intelligence

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

MiniGemini

Official implementation for Mini-Gemini

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

open-unmix-pytorch

Open-Unmix - Music Source Separation for PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

phase_augmentation_one_to_many

Submitted to ICASSP 2023

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

sgmse_Speech-Enhancement-and-Dereverberation-with-Diffusion-based-Generative-Models

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

snake

SNAKE Inspired by "Neural Networks Fail to Learn Periodic Functions and How to Fix It"

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

stable-audio-tools

Generative models for conditional audio generation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

stable-speech

Reproduction of Stability AI's Text-to-Speech model.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

storm

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

vector-quantize-pytorch

Vector Quantization, in Pytorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

visqol

Perceptual Quality Estimator for speech and audio

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

vitsgpt-vits

the code for vits in the vitsGPT project

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

WavCraft

Official repo for WavCraft, an AI agent for audio creation and editing

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0