Boring Task AI's repositories
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
Bark-Voice-Cloning
Bark Voice Cloning and Voice Cloning for Chinese Speech
bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
common-voice-l10n
l10n for the common-voice project, since Pontoon sync takes too long
Compose_and_Embellish
Official PyTorch implementation of ICASSP 2023 paper "Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach"
CorporaCreator
Command line tool to create corpora for Common Voice
ddsp-piano
MIDI Piano synthesizer using DDSP.
eShop
A reference .NET application implementing an eCommerce site
fullcontrol
Python version of FullControl for toolpath design (and more) - the readme below is best source of information
grok-1
Grok open release
Image-Captioning-using-llava-and-llama3
Image Caption Generator using llava and llama3 through the ollama library
knn-vc
Voice Conversion With Just Nearest Neighbors
nendo-platform
Nendo is an open source platform for AI-driven audio management, intelligence, and generation.
nendo-server
The Nendo API Server.
nendo-web
The Nendo Web Frontend.
nendo_plugin_stemify_demucs
Nendo Plugin for Music Source Separation.
openscad-playground
OpenSCAD Web Playground
parler-tts
Inference and training library for high-quality TTS models.
pontoon
Mozilla's Localization Platform
pontoon-intro
Introduction to Pontoon
so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
stable-audio-tools
Generative models for conditional audio generation
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
whatsapp-api
This project is a REST API wrapper for the whatsapp-web.js library, providing an easy-to-use interface to interact with the WhatsApp Web platform.
whisper
Robust Speech Recognition via Large-Scale Weak Supervision