ensky0's starred repositories
iCanHazShortcut
Simple shortcut manager for macOS
silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
audiotools
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
ExpressiveTacotron
This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN, Non-attentive Tacotron, GST, VAE, GMVAE, and X-vectors for building the prosody encoder.
SQLCipher-Password-Cracker-OpenCL
Password cracker for SQLCipher v2 using OpenCL
Non-Attentive-Tacotron
A PyTorch implementation of Google's Non-Attentive Tacotron.
Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Voice_Activity_Detector
A statistical model-based voice activity detector
Voice-Activity-Detection
Efficient voice activity detection algorithms using long-term speech information in C++
whisper.cpp
Port of OpenAI's Whisper model in C/C++
Autoformer
Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008
longformer
Longformer: The Long-Document Transformer
performer-pytorch
An implementation of Performer, a linear attention-based transformer, in PyTorch
TransTacoS-RetuneGAN
A toy text-to-speech system for Chinese/Mandarin synthesis, inspired by Tacotron, FastSpeech2, and RefineGAN.