asr hacktoberfest kaldi kaldi-asr kaldi-librispeech speech speech-recognition

Kaldi Notes

This repo collects the Kaldi material I keep referring back to: important links and lectures that help with using Kaldi, plus PDFs of my own notes. Resources for the Kaldi toolkit are spread out across the internet, and although many similar repositories exist, their links are often outdated as of 2022. This repository serves as a list of the good links I found online for learning Kaldi and its internal workings, and should help demystify how Kaldi works.

I won't accept Pull Requests that only fix spelling errors. I consider it the responsibility of other users to raise meaningful Pull Requests that help with the cause of learning Kaldi.

Kaldi Lectures

These links point to lectures on Kaldi given by Dan Povey.

https://github.com/Agrover112/IIITH-Speech-Internship/tree/master/Kaldi/Lectures

Text Preprocessing

Text preprocessing is an important part of ASR, whether you are preparing transcripts from raw data or cleaning transcripts before building lexicon files. Doing the preprocessing with Linux command-line tools can prevent errors further downstream in the pipeline.

Detailed explanation of prepare_dict.sh for lexicon creation
http://jrmeyer.github.io/misc/2019/03/02/Linux-textProc-Notes.html
Get the first column: this can be helpful when splitting lexicon files.
Normalize tabs and repeated spaces, then extract one column of the lexicon file: cat file_lexicon.txt | tr "\t" " " | tr -s " " | cut -d" " -f1 | sort | uniq (as written, -f1 selects the first column; use -f2 for the second)
tr command
Weird Unicode quotation symbols: UTF-8 text can contain characters that look like their US-ASCII counterparts but are different, such as curly versus straight quotation marks, three separate periods (...) versus the single ellipsis character (…), grave accents, etc., which may or may not be wanted in your text file. This link should help you understand how they differ despite looking similar to the untrained eye.
Convert numbers in transcript to words, also for Indic languages
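
The column-extraction pipeline and the quotation-mark cleanup listed above can also be sketched in Python. This is a minimal illustration, not part of Kaldi: the names `normalize_punctuation`, `first_column`, and `ASCII_EQUIVALENTS` are made up for this example, and the character table covers only a handful of symbols. Whether you want these replacements at all depends on your transcripts.

```python
import re

# Illustrative subset of look-alike characters mapped to ASCII stand-ins.
# Assumption: your text should be plain ASCII punctuation; extend as needed.
ASCII_EQUIVALENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # single-character horizontal ellipsis
}


def normalize_punctuation(text):
    """Replace curly quotes and the ellipsis character with ASCII forms."""
    return text.translate(str.maketrans(ASCII_EQUIVALENTS))


def first_column(lines):
    """Return the sorted, de-duplicated first field of each line.

    Roughly mirrors: tr TAB SPACE | tr -s SPACE | cut -d SPACE -f1 | sort | uniq
    Differences: empty first fields are skipped, and sorting is by code
    point rather than by the shell's locale.
    """
    words = set()
    for line in lines:
        # Collapse any run of tabs/spaces into a single space, then split.
        fields = re.sub(r"[ \t]+", " ", line.rstrip("\n")).split(" ")
        if fields[0]:
            words.add(fields[0])
    return sorted(words)


if __name__ == "__main__":
    lexicon = ["hello\tHH AH L OW\n", "world  W ER L D\n", "hello HH EH L OW\n"]
    print(first_column(lexicon))
    print(normalize_punctuation("\u201cquoted\u201d text\u2026"))
```

To run it on a real lexicon, pass an open file object to `first_column`, since iterating a file yields its lines.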

Kaldi miscellaneous

Theory

Some links related to theory WFST

Maximum Likelihood Estimation

Signal Processing

Introduction to Digital Signal Processing

Decoding

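The decoding process is important to understand, as it is responsible for the final output. Kaldi builds its decoding graph (HCLG) by composing weighted finite-state transducers (WFSTs); lattices are what the decoder produces when it searches that graph. I loosely think of composition as a dot product of tensors.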
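
To make the idea of composition concrete, here is a toy sketch in Python. It is not Kaldi or OpenFst code: the `Arc`, `Fst`, and `compose` names are hypothetical, weights are assumed to be costs that add along a path (tropical semiring), and epsilon labels, which real decoding graphs rely on, are deliberately not handled. An arc of the result exists wherever an output label of the first transducer matches an input label of the second.

```python
from collections import deque, namedtuple

# One transition: source state, input label, output label, cost, target state.
Arc = namedtuple("Arc", "src ilabel olabel weight dst")


class Fst:
    """A bare-bones transducer: one start state, final states, and arcs."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = set(finals)
        self.arcs = list(arcs)

    def arcs_from(self, state):
        return [arc for arc in self.arcs if arc.src == state]


def compose(a, b):
    """Epsilon-free composition: states of the result are pairs (qa, qb)."""
    start = (a.start, b.start)
    queue = deque([start])
    seen = {start}
    arcs, finals = [], set()
    while queue:
        state = queue.popleft()
        qa, qb = state
        # A paired state is final only if both components are final.
        if qa in a.finals and qb in b.finals:
            finals.add(state)
        for x in a.arcs_from(qa):
            for y in b.arcs_from(qb):
                if x.olabel != y.ilabel:
                    continue  # labels must match in the middle
                dst = (x.dst, y.dst)
                # Costs add along a path in the tropical semiring.
                arcs.append(Arc(state, x.ilabel, y.olabel, x.weight + y.weight, dst))
                if dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
    return Fst(start, finals, arcs)


if __name__ == "__main__":
    # A maps a->x then b->y; B rewrites x->X and y->Y in a single looping state.
    fst_a = Fst(0, {2}, [Arc(0, "a", "x", 0.5, 1), Arc(1, "b", "y", 1.0, 2)])
    fst_b = Fst(0, {0}, [Arc(0, "x", "X", 0.25, 0), Arc(0, "y", "Y", 0.75, 0)])
    for arc in compose(fst_a, fst_b).arcs:
        print(arc)
```

The pairing of states is where the tensor analogy comes from: the middle labels are "summed out", leaving a machine that maps the first transducer's inputs directly to the second transducer's outputs.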

Common Kaldi Errors & Questions

A list of useful threads I bookmarked about errors faced by Kaldi users. Note: you might need to join the Google Group to view them.

