vladbataev

Vlad Bataev's starred repositories

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook · MIT · 33932 316 423

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · Apache-2.0 · 23536 217 3588

spotify-downloader

Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).

Language: Python · MIT · 15745 189 1448

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook · NOASSERTION · 10581 141 338

ml-engineering

Machine Learning Engineering Open Book

Language: Python · CC-BY-SA-4.0 · 10290 107 18

lyra

A Very Low-Bitrate Codec for Speech Compression

Language: C++ · Apache-2.0 · 3803 113 125

adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Language: Jupyter Notebook · Apache-2.0 · 2480 30 376

wemake-python-styleguide

The strictest and most opinionated python linter ever!

Language: Python · MIT · 2473 31 1089

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python · NOASSERTION · 2346 42 102

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Language: Python · MIT · 2331 60 167

DeepFilterNet

Noise suppression using deep filtering

Language: Python · NOASSERTION · 2210 32 268

AudioLDM2

Text-to-Audio/Music Generation

Language: Python · NOASSERTION · 2164 44 66

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

1874 169 4

eng-handbook

A developer's guide to management: an open-sourced handbook for leading software engineering teams.

GPL-3.0 · 1536 73 2

awesome-talking-head-generation

1307 72 2

ftp

FTP client package for Go

Language: Go · ISC · 1265 26 144

CLAP

Contrastive Language-Audio Pretraining

Language: Python · CC0-1.0 · 1264 29 84

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Language: Python · MIT · 1007 24 52

YaFSDP

YaFSDP: Yet another Fully Sharded Data Parallel

Language: Python · Apache-2.0 · 793 14 3

pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies deploying Triton in Python environments.

Language: Python · Apache-2.0 · 692 17 72

hidet

An open-source, efficient deep learning framework/compiler, written in Python.

Language: Python · Apache-2.0 · 635 17 81

audio-dataset

Audio Dataset for training CLAP and other models

Language: Python · 605 21 57

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

511 30 2

COMET

A Neural Framework for MT Evaluation

Language: Python · Apache-2.0 · 453 17 161

knn-vc

Voice Conversion With Just Nearest Neighbors

Language: Python · NOASSERTION · 431 14 35

WaveDiff

Official PyTorch implementation of the paper "Wavelet Diffusion Models are fast and scalable Image Generators" (CVPR'23)

Language: Python · AGPL-3.0 · 356 12 13

cookbook

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Language: Python · Apache-2.0 · 215 8 11

DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Language: Python · MIT · 80 4 2

podcasts-dataset

Dataset of podcasts and episodes

Language: Python · 13 30

sd-benchmarks

Stable Diffusion inference benchmarks

Language: Python · 10 40