dongwon00kim

dongwon00kim's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause

attention-is-all-you-need-paper

Implementation of Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017.

Language: Jupyter Notebook | License: MIT

audio-degradation-toolbox

Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox

Language: Python | License: GPL-2.0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT

BaseLib

Language: C++ | License: Apache-2.0

CMGAN

Conformer-based Metric GAN for speech enhancement

Language: Python | License: MIT

ffprobe-python

A wrapper around the ffprobe command for extracting metadata from media files.

Language: Python | License: NOASSERTION

hifi-gan-bwe

Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.

Language: Python | License: MIT

K-wav2vec

Language: Python | License: Apache-2.0

nerf

Code release for NeRF (Neural Radiance Fields)

Language: Jupyter Notebook | License: MIT

onvif-ipc-server

Develops an IPC device with support for Profile S and Profile G

riffusion

Stable diffusion for real-time music generation

License: MIT

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector

Language: Python | License: MIT

SMART-G2P

License: GPL-3.0

SmartThingsPublic

SmartThings open-source DeviceTypeHandlers and SmartApps code

Language: Groovy

sony-ai-research

Language: Python | License: Apache-2.0

stable-diffusion

A latent text-to-image diffusion model

License: NOASSERTION

stable-ts-whisper

Stabilizing timestamps of OpenAI's Whisper outputs down to word-level

Language: Python | License: MIT

StarGAN-Voice-Conversion-2

A PyTorch implementation of StarGAN-VC2

tar1090

Provides an improved web interface for use with the ADS-B decoders readsb / dump1090-fa

License: NOASSERTION

torch-yin

Yin pitch estimator in PyTorch

License: MIT

vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Can be trained on a single GPU!

Language: Python | License: Apache-2.0

vall-e-EnCodec

An unofficial PyTorch implementation of the audio LM VALL-E

License: MIT

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

License: MIT

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT

whisper.cpp

Port of OpenAI's Whisper model in C/C++

License: MIT

whisper_real_time

Real-time transcription with OpenAI Whisper.
