Qoboty's repositories
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Bert-VITS2-ext
Facial expression and animation experiments based on Bert-VITS2
best-rq-pytorch
Implementation of BEST-RQ, a model for self-supervised learning of speech signals using a random-projection quantizer, in PyTorch.
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
clash
A rule-based tunnel in Go.
CosyVoice
LLM-based TTS model, providing full-stack inference/training/deployment capabilities.
fish-speech
A brand-new TTS solution
HierSpeechpp
The official implementation of HierSpeech++
Inpaint-Anything
Inpaint anything using Segment Anything and inpainting models.
llark
Code for the paper "LLark: A Multimodal Foundation Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.
ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
magic-animate
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
OpenVoice
Instant voice cloning by MyShell
parler-tts
Inference and training library for high-quality TTS models.
SoundStorm
A reproduction of Google's SoundStorm
stable-audio-tools
Generative models for conditional audio generation
StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
TTS-xtts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
UMOE-Scaling-Unified-Multimodal-LLMs
Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"
UniAudio
The open-source code of UniAudio
UniCATS-CTX-txt2vec
CTX-txt2vec, the acoustic model in UniCATS
UniCATS-CTX-vec2wav
Code for CTX-vec2wav in UniCATS
Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
vocode-python
🤖 Build voice-based LLM agents. Modular + open source.
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild