Vlad Bataev (vladbataev)

Company: @yandex

Location: Istanbul

Vlad Bataev's starred repositories

DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Language: Jupyter Notebook | Stargazers: 13160 | Issues: 300 | Issues: 833

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | Stargazers: 9334 | Issues: 89 | Issues: 362

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python | License: MIT | Stargazers: 8085 | Issues: 152 | Issues: 539

gpustat

📊 A simple command-line utility for querying and monitoring GPU status

Language: Python | License: MIT | Stargazers: 3994 | Issues: 45 | Issues: 121

TensorFlowTTS

😝 TensorFlowTTS: Real-time state-of-the-art speech synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Language: Python | License: Apache-2.0 | Stargazers: 3792 | Issues: 78 | Issues: 684

kenlm

KenLM: Faster and Smaller Language Model Queries

Language: C++ | License: NOASSERTION | Stargazers: 2473 | Issues: 70 | Issues: 368

awesome-git

A curated list of amazingly awesome Git tools, resources and shiny things

libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

Language: C++ | License: NOASSERTION | Stargazers: 2294 | Issues: 68 | Issues: 94

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language: Python | License: MIT | Stargazers: 1875 | Issues: 32 | Issues: 162

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, that further improve model performance and generalization.

Language: Python | License: NOASSERTION | Stargazers: 1632 | Issues: 38 | Issues: 149

TurboTransformers

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc.) on CPU and GPU.

Language: C++ | License: NOASSERTION | Stargazers: 1464 | Issues: 41 | Issues: 118

ffmpeg-normalize

Audio Normalization for Python/ffmpeg

Language: Python | License: MIT | Stargazers: 1232 | Issues: 28 | Issues: 208

svoice

We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multiple Speakers, in which we present a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps while keeping the speaker in each output channel fixed. A different model is trained for every possible number of speakers, and the model with the largest number of speakers is used to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.

Language: Python | License: NOASSERTION | Stargazers: 1210 | Issues: 25 | Issues: 89

speech-synthesis-paper

List of speech synthesis papers.

autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Language: Python | License: MIT | Stargazers: 983 | Issues: 30 | Issues: 112

klio

Smarter data pipelines for audio.

Language: Python | License: Apache-2.0 | Stargazers: 834 | Issues: 21 | Issues: 6

auraloss

Collection of audio-focused loss functions in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 709 | Issues: 18 | Issues: 35

nnsvs

Neural network-based singing voice synthesis library for research

Language: Python | License: MIT | Stargazers: 678 | Issues: 38 | Issues: 76

setuptools-rust

Setuptools plugin for Rust support

Language: Python | License: MIT | Stargazers: 589 | Issues: 16 | Issues: 123

tacotron

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

Language: HTML | License: NOASSERTION | Stargazers: 529 | Issues: 74 | Issues: 0

Thorsten-Voice

Thorsten-Voice: A free-to-use, offline-capable, high-quality German TTS voice should be available for every project without any licensing struggles.

Language: Python | License: CC0-1.0 | Stargazers: 521 | Issues: 19 | Issues: 59

WaveGrad

Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 400 | Issues: 17 | Issues: 26

Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX

Language: Python | License: MIT | Stargazers: 331 | Issues: 27 | Issues: 70

tacotron2-vae

Implementation of "Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis"

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 167 | Issues: 10 | Issues: 6

pitchtron

TTS for pitch-accented language. Korean dialect DB.

Language: Python | License: NOASSERTION | Stargazers: 156 | Issues: 9 | Issues: 8

vcc20_baseline_cyclevae

Voice Conversion Challenge 2020 CycleVAE baseline system

Language: Python | License: MIT | Stargazers: 132 | Issues: 6 | Issues: 9

emotiontts_open_db

λ‘œλ΄‡μ˜ 감정 및 κ°œμ„±μ„ ν‘œν˜„ν•  수 μžˆλŠ” λŒ€ν™”ν˜• μŒμ„±ν•©μ„± μ˜€ν”ˆμ†ŒμŠ€ ν”Œλž«νΌ

Intelligibility-MetricGAN

Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning"

Language: Python | License: BSD-3-Clause | Stargazers: 52 | Issues: 8 | Issues: 3

soxbindings

Python bindings for SoX, aiming to replicate a subset of the command line sox utility.

Language: Python | License: MIT | Stargazers: 52 | Issues: 2 | Issues: 8

catboost-go

Catboost Go Wrapper