macroustc

macroustc's repositories

OpenVoice

Instant voice cloning

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

audino

Open source audio annotation tool for humans

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0

Awesome-Talking-Face

📖 A curated list of resources dedicated to talking face.

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Video-Diffusion-Models

[arXiv] A Survey on Video Diffusion Models

Stargazers: 0 | Issues: 0

Bert-VITS2

VITS2 backbone with BERT

Language: Python | License: AGPL-3.0 | Stargazers: 0 | Issues: 0

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

License: NOASSERTION | Stargazers: 0 | Issues: 0

DeepLearningSystem

An introduction to the core principles of deep learning systems.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

License: Apache-2.0 | Stargazers: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

fish-speech

Brand new TTS solution

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

GPT-SoVITS

One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

llm-paper-daily

Daily updated LLM papers. Subscriptions are welcome 👏 If you like it, please give it a star 🌟

Stargazers: 0 | Issues: 0

minisora

The Mini Sora project aims to explore the implementation path and future development direction of Sora.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Open-Sora

Building your own video generation model like OpenAI's Sora

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the entire open-source community will contribute to this project.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0

phonemizer

Simple text to phones converter for multiple languages

License: GPL-3.0 | Stargazers: 0 | Issues: 0

piper

A fast, local neural text to speech system

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

License: MIT | Stargazers: 0 | Issues: 0

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

UniAudio

The Open Source Code of UniAudio

Language: Python | Stargazers: 0 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

License: NOASSERTION | Stargazers: 0 | Issues: 0

yt-dlp

A feature-rich command-line audio/video downloader

License: Unlicense | Stargazers: 0 | Issues: 0