Young Han Lee's starred repositories

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | Stargazers: 2564 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18275 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20262 | Issues: 0

Language: Python | Stargazers: 7 | Issues: 0

lvc-vc

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions

Language: Python | License: MIT | Stargazers: 81 | Issues: 0

korean-romanizer

A Python library for Korean romanization

Language: Python | License: NOASSERTION | Stargazers: 93 | Issues: 0

sherpa

Speech-to-text server framework with next-gen Kaldi

Language: C++ | License: Apache-2.0 | Stargazers: 484 | Issues: 0

License: NOASSERTION | Stargazers: 40 | Issues: 0

PyConKR2023-ModelServing-BentoML

PyCon KR 2023 presentation

Language: HTML | License: MIT | Stargazers: 13 | Issues: 0

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

License: MIT | Stargazers: 1243 | Issues: 0

SVCC23_FastSVC

Singing Voice Conversion Challenge 2023 Starter Kit: FastSVC Reimplementation

Language: Python | Stargazers: 108 | Issues: 0

s3prl-vc

S3PRL-VC: A Voice Conversion Toolkit based on S3PRL

Language: Python | License: MIT | Stargazers: 89 | Issues: 0

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 284 | Issues: 0

CLAP

Contrastive Language-Audio Pretraining

Language: Python | License: CC0-1.0 | Stargazers: 1260 | Issues: 0

photometric_optimization

Photometric optimization code for creating the FLAME texture space and other applications

Language: Python | License: MIT | Stargazers: 504 | Issues: 0

DualCycleGAN

Official implementation of DualCycleGAN for nonparallel audio super resolution

Language: Python | License: Apache-2.0 | Stargazers: 47 | Issues: 0

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Language: Python | License: Apache-2.0 | Stargazers: 405 | Issues: 0

SiFiGAN

Official implementation of the source-filter HiFiGAN vocoder

Language: Python | License: MIT | Stargazers: 234 | Issues: 0

nnsvs

Neural network-based singing voice synthesis library for research

Language: Python | License: MIT | Stargazers: 670 | Issues: 0

Awesome-Gaze-Estimation

Awesome Curated List of Eye Gaze Estimation Papers

Stargazers: 441 | Issues: 0

gpu-burn

Multi-GPU CUDA stress test

Language: C++ | License: BSD-2-Clause | Stargazers: 1276 | Issues: 0

FACEGOOD-Audio2Face

http://www.facegood.cc

Language: Python | License: MIT | Stargazers: 1776 | Issues: 0

Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.

Language: Python | Stargazers: 9841 | Issues: 0

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language: Jupyter Notebook | Stargazers: 547 | Issues: 0

BentoML

The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

Language: Python | License: Apache-2.0 | Stargazers: 6821 | Issues: 0

YOLOX_AUDIO

Audio event detection model based on YOLOX

Language: Python | License: Apache-2.0 | Stargazers: 82 | Issues: 0

ort

Accelerate PyTorch models with ONNX Runtime

Language: Python | License: MIT | Stargazers: 350 | Issues: 0

torchgpipe

A GPipe implementation in PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 790 | Issues: 0

code-server

VS Code in the browser

Language: TypeScript | License: MIT | Stargazers: 66697 | Issues: 0

VARA-TTS

Demo audio of VARA-TTS model

Stargazers: 20 | Issues: 0