xzm2004 (xzm2004260)

Location: Xiamen

xzm2004's repositories

awesome-music-informatics

A curated list of awesome articles, tutorials, libraries, webpages, etc.

Stargazers: 1 | Issues: 0

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python | License: CC-BY-4.0 | Stargazers: 1 | Issues: 0

FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

License: MIT | Stargazers: 1 | Issues: 0

agc

Audiogen Codec

License: MIT | Stargazers: 0 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

License: MIT | Stargazers: 0 | Issues: 0

audioFlux

A library for audio and music analysis and feature extraction.

License: MIT | Stargazers: 0 | Issues: 0

audioseal

Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector.

License: MIT | Stargazers: 0 | Issues: 0

audiowmark

Audio Watermarking

License: GPL-3.0 | Stargazers: 0 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

License: MIT | Stargazers: 0 | Issues: 0

DiJiang

The official implementation of "DiJiang: Efficient Large Language Models through Compact Kernelization"

Stargazers: 0 | Issues: 0

IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

megatts2

Unofficial implementation of Megatts2

License: MIT | Stargazers: 0 | Issues: 0

metavoice-src

Foundational model for human-like, expressive TTS

License: Apache-2.0 | Stargazers: 0 | Issues: 0

muzic

Muzic: Music Understanding and Generation with Artificial Intelligence

License: MIT | Stargazers: 0 | Issues: 0

open-musiclm

Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

RTNeural

Real-time neural network inferencing

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

License: NOASSERTION | Stargazers: 0 | Issues: 0

so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

License: NOASSERTION | Stargazers: 0 | Issues: 0

sparse-vqvae

Experimental implementation for a sparse-dictionary based version of the VQ-VAE2 paper

License: NOASSERTION | Stargazers: 0 | Issues: 0

Speech-Editing-Toolkit

A repository of implementations of neural speech editing algorithms.

Stargazers: 0 | Issues: 0

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

License: MIT | Stargazers: 0 | Issues: 0

supervoice-gpt

GPT-style network for phonemization with durations of text

Stargazers: 0 | Issues: 0

ttts

Train the next generation of TTS systems.

License: MPL-2.0 | Stargazers: 0 | Issues: 0

USLM

Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"

Stargazers: 0 | Issues: 0

vall-e

PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

License: Apache-2.0 | Stargazers: 0 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

License: NOASSERTION | Stargazers: 0 | Issues: 0

wavmark

AI-based Audio Watermarking Tool

License: MIT | Stargazers: 0 | Issues: 0