dongwon00kim

dongwon00kim's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

attention-is-all-you-need-paper

Implementation of Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems. 2017.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

audio-degradation-toolbox

Easy-to-use implementation of the ISMIR 2013 Audio Degradation Toolbox

Language: Python | License: GPL-2.0 | Stargazers: 0 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

CMGAN

Conformer-based Metric GAN for speech enhancement

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

ffprobe-python

A wrapper around the ffprobe command for extracting metadata from media files.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

hifi-gan-bwe

Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

nerf

Code release for NeRF (Neural Radiance Fields)

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

onvif-ipc-server

Development of an IPC device with support for ONVIF Profile S and Profile G

Stargazers: 0 | Issues: 0

riffusion

Stable Diffusion for real-time music generation

License: MIT | Stargazers: 0 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector

Language: Python | License: MIT | Stargazers: 0 | Issues: 0
License: GPL-3.0 | Stargazers: 0 | Issues: 0

SmartThingsPublic

SmartThings open-source DeviceTypeHandlers and SmartApps code

Language: Groovy | Stargazers: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

stable-diffusion

A latent text-to-image diffusion model

License: NOASSERTION | Stargazers: 0 | Issues: 0

stable-ts-whisper

Stabilizing timestamps of OpenAI's Whisper outputs down to word-level

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

StarGAN-Voice-Conversion-2

A PyTorch implementation of StarGAN-VC2

Stargazers: 0 | Issues: 0

tar1090

Provides an improved web interface for use with the ADS-B decoders readsb and dump1090-fa

License: NOASSERTION | Stargazers: 0 | Issues: 0

torch-yin

Yin pitch estimator in PyTorch

License: MIT | Stargazers: 0 | Issues: 0

vall-e-EnCodec

An unofficial PyTorch implementation of the audio LM VALL-E

License: MIT | Stargazers: 0 | Issues: 0

VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

License: MIT | Stargazers: 0 | Issues: 0

vall-ef

PyTorch implementation of VALL-E (zero-shot text-to-speech). Can be trained on a single GPU.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

whisper.cpp

Port of OpenAI's Whisper model in C/C++

License: MIT | Stargazers: 0 | Issues: 0

whisper_real_time

Real-time transcription with OpenAI Whisper.

Stargazers: 0 | Issues: 0