Yeongtae

Company: @neosapience

Location: South Korea

Home Page: https://www.linkedin.com/in/yeongtae-hwang-3a40b0163/

Yeongtae's starred repositories

SenseVoice

Multilingual Voice Understanding Model

Language: Python | License: NOASSERTION | Stargazers: 1857 | Issues: 0

matmulfreellm

Implementation for MatMul-free LM.

Language: Python | License: Apache-2.0 | Stargazers: 2782 | Issues: 0

Speech2RIR

Official implementation of a reverberant-speech-to-room-impulse-response estimator

Language: Python | License: NOASSERTION | Stargazers: 8 | Issues: 0

BiRR

Binaural Room Reverb

Language: C | License: LGPL-3.0 | Stargazers: 13 | Issues: 0

uroman

Universal romanizer that can convert any Unicode script to Roman (Latin) script

Language: Perl | License: NOASSERTION | Stargazers: 134 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0

Diff-HierVC

Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"

Language: Python | Stargazers: 180 | Issues: 0

TTS-arxiv-daily

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Language: Python | License: Apache-2.0 | Stargazers: 162 | Issues: 0

AI-For-Beginners

12 Weeks, 24 Lessons, AI for All!

Language: Jupyter Notebook | License: MIT | Stargazers: 33516 | Issues: 0
Language: HTML | Stargazers: 37 | Issues: 0
Language: Python | Stargazers: 231 | Issues: 0

pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

Language: Python | License: MIT | Stargazers: 182 | Issues: 0

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

Stargazers: 527 | Issues: 0

voxconverse

Spot the conversation: speaker diarisation in the wild

Stargazers: 115 | Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 7294 | Issues: 0

champ

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Language: Python | License: MIT | Stargazers: 3519 | Issues: 0

TTSDatasetRecorder

A simple app for recording speech datasets.

Language: Python | Stargazers: 25 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to it.

Language: Python | License: MIT | Stargazers: 11074 | Issues: 0

yt-dlp

A feature-rich command-line audio/video downloader

Language: Python | License: Unlicense | Stargazers: 78510 | Issues: 0

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python | License: NOASSERTION | Stargazers: 4806 | Issues: 0

metavoice-src

Foundational model for human-like, expressive TTS

Language: Python | License: Apache-2.0 | Stargazers: 3593 | Issues: 0

DDDM-VC

Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Language: Python | Stargazers: 157 | Issues: 0

ar-vits

Text-to-speech using an autoregressive transformer and VITS

Language: Python | License: MIT | Stargazers: 216 | Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 30189 | Issues: 0

AnyText

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Language: Python | License: Apache-2.0 | Stargazers: 4088 | Issues: 0

TinySAM

Official PyTorch implementation of "TinySAM: Pushing the Envelope for Efficient Segment Anything Model"

Language: Python | License: Apache-2.0 | Stargazers: 381 | Issues: 0

clone-voice

A sound cloning tool with a web interface, using your voice or any sound to record audio

Language: Python | License: NOASSERTION | Stargazers: 6862 | Issues: 0

resemble-enhance

AI-powered speech denoising and enhancement

Language: Python | License: MIT | Stargazers: 1152 | Issues: 0

OpenVoice

Instant voice cloning by MyShell.

Language: Python | License: MIT | Stargazers: 27754 | Issues: 0

Automatic-Prosody-Annotator-with-SSWP-CLAP

An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).

Language: Python | License: Apache-2.0 | Stargazers: 42 | Issues: 0