Thodoris Kouzelis's repositories
DreamSound
Code for Investigating Personalization Methods in Text to Music Generation
DisfluentFA
A Weakly Supervised Forced Alignment for disfluent speech
KaldiLongAligner
Speech to Text Alignment tool implemented with Python and Kaldi
awesome-LoRA
A curated list of Parameter Efficient Fine-tuning papers with a TL;DR
Reading-Diffusion
A collection of interesting papers on Diffusion Models
localdiff-demo
A repo containing a demo for Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
Folder-Structure-Conventions
Folder / directory structure options and naming conventions for software projects
kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
kaldi-long-audio-alignment
Long audio alignment using Kaldi
presentations
A repo for storing Marp presentations
sail_align
SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition and text alignment scheme that allows for the processing of very long (and possibly noisy) audio and is robust to transcription errors. It is mainly written as a Perl library, but its functionality also depends on freely available software, namely HTK, SRILM and sclite.
secretsanta
Host secret santa without leaking your guests' information 🎄
wavetransformer
Code base for WaveTransformer: A novel architecture for automated audio captioning