Kei Akuzawa's starred repositories
chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
voice-generator-webui
A multi-speaker, multilingual speech generation tool
pyroomacoustics
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
vits2_pytorch
Unofficial VITS2 TTS implementation in PyTorch
Awesome-LLMOps
An awesome & curated list of the best LLMOps tools for developers
Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
scrape-youtube
A lightning fast package to scrape YouTube search results
awesome-asr-contextualization
A curated list of awesome papers on contextualizing E2E ASR outputs
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
espnet_onnx
ONNX wrapper for ESPnet inference models
dr-doc-search
Converse with a book - Built with GPT-3
torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
multimodal-vae-public
A PyTorch implementation of "Multimodal Generative Models for Scalable Weakly-Supervised Learning" (https://arxiv.org/abs/1802.05335)