dongwon00kim's repositories
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
attention-is-all-you-need-paper
Implementation of Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017.
audio-degradation-toolbox
Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
CMGAN
Conformer-based Metric GAN for speech enhancement
ffprobe-python
A wrapper around the ffprobe command for extracting metadata from media files.
hifi-gan-bwe
Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.
nerf
Code release for NeRF (Neural Radiance Fields)
onvif-ipc-server
Development of an IPC device with support for ONVIF Profile S and Profile G
riffusion
Stable diffusion for real-time music generation
silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
SmartThingsPublic
SmartThings open-source DeviceTypeHandlers and SmartApps code
stable-diffusion
A latent text-to-image diffusion model
stable-ts-whisper
Stabilizing timestamps of OpenAI's Whisper outputs down to the word level
StarGAN-Voice-Conversion-2
A PyTorch implementation of StarGAN-VC2
tar1090
Provides an improved web interface for use with ADS-B decoders readsb / dump1090-fa
torch-yin
Yin pitch estimator in PyTorch
vall-e
PyTorch implementation of VALL-E (zero-shot text-to-speech). Can be trained on a single GPU!
vall-e-EnCodec
An unofficial PyTorch implementation of the audio LM VALL-E
VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
whisper.cpp
Port of OpenAI's Whisper model in C/C++
whisper_real_time
Real-time transcription with OpenAI Whisper.