Ruoho Ruotsi's repositories
speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
neural-animation
Implementing neural art on video
1D-StateSpace
This repository contains the implementation of an efficient joint beat, downbeat, tempo, and meter tracking system using a compact 1D probabilistic state space and a jump-back reward technique. ICASSP 2022.
Anime4K
A High-Quality Real-Time Anime Upscaler
audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
beat-tracking-tcn
An implementation of Davies & Böck's beat-tracking temporal convolutional network
DeforumStableDiffusionLocal
Local version of Deforum Stable Diffusion; supports txt settings file input and animation features.
libACA
C++ code accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
mtandseq2seq-code
Code examples for CMU CS11-731, Machine Translation and Sequence-to-sequence Models
numpy-ml
Machine learning, in numpy
omnizart
Omniscient Mozart: transcribes everything in music, including vocals, drums, chords, beats, instruments, and more.
open-bible-scripts
Scripts for working with open.bible data
pytorch-softdtw
An implementation of SoftDTW for PyTorch.
riffusion-inference
Stable diffusion for real-time music generation
ruohorecords
Miscellaneous label prep tasks (logo design, websites, etc.)
silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity and Number Detector
stable_diffusion_playground
Playing around with stable diffusion. Generated images are reproducible because I save the metadata and latent information. You can generate and then later interpolate between the images of your choice.
vosk
VOSK Speech Recognition Toolkit
whisper.cpp
Port of OpenAI's Whisper model in C/C++
x-rhythm-can
Creative Adversarial Network for generating Dance Music Rhythm Patterns