beckgom

Young Han Lee's repositories

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

License: NOASSERTION

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

License: Apache-2.0

MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2

License: BSD-3-Clause

sherpa

Speech-to-text server framework with next-gen Kaldi

License: Apache-2.0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook, License: MIT

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

License: MIT

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

License: NOASSERTION

CLAP

Contrastive Language-Audio Pretraining

License: CC0-1.0

korean-romanizer

A Python library for Korean romanization

License: NOASSERTION

beckgom.github.com

Language: HTML, License: MIT

UUVC

License: MIT

vdm

License: Apache-2.0

FACEGOOD-Audio2Face

http://www.facegood.cc

License: MIT

gpu-burn

Multi-GPU CUDA stress test

License: BSD-2-Clause

photometric_optimization

Photometric optimization code for creating the FLAME texture space and other applications

License: MIT

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

License: Apache-2.0

YOLOX_AUDIO

Audio event detection model based on YOLOX

Language: Python, License: Apache-2.0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

License: MIT

cargan

Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"

License: MIT

Wav2Lip

This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

SimCLR

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Language: Jupyter Notebook, License: MIT

espnet

End-to-End Speech Processing Toolkit

License: Apache-2.0

beckgom-page

Language: Jupyter Notebook, License: MIT

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

License: MIT

PyTorch-StudioGAN

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

License: NOASSERTION

captionr-static-web-app

Real-time captioning and translation app on Azure Static Web Apps

wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

Language: C++, License: NOASSERTION

Resemblyzer

A python package to analyze and compare voices with deep learning

License: Apache-2.0

NVAE

The Official PyTorch Implementation of "NVAE: A Deep Hierarchical Variational Autoencoder" (NeurIPS 2020 spotlight paper)

License: NOASSERTION