Karel Vesely's repositories
kaldi-io-for-python
Python functions for reading Kaldi data formats. Useful for rapid prototyping with Python.
atco2-corpus
A Corpus for Research on Robust Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
CQT_toolbox_python
Constant-Q Transform Toolbox for Python/MATLAB
cylimiter
A C++/Cython audio limiter for Python.
espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
fixwav
Quick utility to fix WAV files with incorrect lengths
GigaSpeech
Large, modern dataset for speech recognition
gpt4all
gpt4all: open-source LLM chatbots that you can run anywhere
json
JSON for Modern C++
k2
FSA/FST algorithms, intended to (eventually) be interoperable with PyTorch and similar
kaldi-native-fbank
Kaldi-compatible online fbank extractor without external dependencies
kaldi_native_io
Python wrapper for Kaldi's native I/O
kaldilm
Python wrapper for Kaldi's arpa2fst
lhotse
Tools for handling speech data in machine learning projects.
libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
mistral-src
Reference implementation of Mistral AI 7B v0.1 model.
mitlm
MIT Language Modeling Toolkit
personalVAD
An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.
Phonetisaurus
Phonetisaurus G2P
pocolm
Small language toolkit for creation, interpolation and pruning of ARPA language models
sherpa
Speech-to-text server framework with next-gen Kaldi
soundslike_icefall
Icefall recipe for the SoundsLike project under JSALT 2023 (VoxPopuli recipe)
vocode-python
🤖 Build voice-based LLM agents. Modular + open source.
wikiextractor
A tool for extracting plain text from Wikipedia dumps