Course Description

0. Tutorial

Tutorial on Python and data science packages

  • Python review
  • numpy
  • matplotlib
  • PyTorch Tensor

1. Audio file handling

Audio file handling using torchaudio

  • Load audio files (torchaudio.load)
  • Feature extraction (Mel-spectrogram, MFCC)
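
As background for the Mel-spectrogram and MFCC features above, the following is a minimal sketch of the Hz-to-mel mapping, assuming the HTK-style formula (the default `mel_scale="htk"` setting in torchaudio's `MelSpectrogram`). The function names `hz_to_mel`, `mel_to_hz`, and `mel_center_frequencies` are hypothetical helpers written for this illustration and are not part of torchaudio.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Center frequencies (Hz) of n_mels filters spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


# Hypothetical settings: 8 filters between 0 Hz and 8 kHz (16 kHz audio).
centers = mel_center_frequencies(n_mels=8, f_min=0.0, f_max=8000.0)
```

The filters are evenly spaced in mels, so their spacing in Hz grows with frequency. This gives a Mel-spectrogram finer resolution at low frequencies than at high ones.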

3. Audio Classification using MLP

Audio MNIST classification using an MLP (torch.nn.Linear)
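
To make the MLP concrete, here is a small pure-Python sketch of the computation that a stack of linear layers performs: each layer computes y = xWᵀ + b, with a ReLU between layers. It is an illustration only, not the course's training code. The weights, the input vector, and the helper names `linear`, `relu`, and `mlp_forward` are all hypothetical; in the exercise this role would be played by `torch.nn.Linear` modules.

```python
def linear(x, weight, bias):
    """One linear layer for a single input vector: y = x @ weight^T + bias.

    weight has shape (out_features, in_features), stored as a list of rows.
    """
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def mlp_forward(x, layers):
    """Apply each (weight, bias) layer in order, with ReLU between layers only."""
    for index, (weight, bias) in enumerate(layers):
        x = linear(x, weight, bias)
        if index < len(layers) - 1:
            x = relu(x)
    return x


# Toy 3 -> 2 -> 2 network with hand-picked (hypothetical) weights.
layers = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 1.0], [-1.0, 2.0]], [0.5, 0.0]),
]
logits = mlp_forward([2.0, 1.0, 3.0], layers)

# The predicted class is the index of the largest logit.
predicted_class = max(range(len(logits)), key=logits.__getitem__)
```

For audio classification the input vector would be a flattened feature matrix (for example a Mel-spectrogram), and the final layer would have one output per class.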

4. CTC

Simple exercise on Connectionist Temporal Classification (model training using CTC loss)
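
CTC relates a frame-level path to a label sequence by merging consecutive repeats and then removing blanks. The sketch below shows that collapse rule and a greedy (best-path) decoder built on it. It is a minimal illustration, not the exercise's training code: the blank symbol `"_"`, the toy scores, and the function names are assumptions made for this example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level path to labels: merge consecutive repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if symbol != blank]


def greedy_decode(frame_scores, labels, blank=BLANK):
    """Take the highest-scoring label in every frame, then collapse the path."""
    best_path = [
        labels[max(range(len(row)), key=row.__getitem__)] for row in frame_scores
    ]
    return ctc_collapse(best_path, blank)


# A blank between two identical labels keeps both of them.
with_blank = "".join(ctc_collapse("hh_e_ll_llo"))
without_blank = "".join(ctc_collapse("hhelllo"))

# Toy per-frame scores over the labels ["_", "a", "b"] (hypothetical values).
labels = ["_", "a", "b"]
frame_scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
]
decoded = greedy_decode(frame_scores, labels)
```

Here `with_blank` is `"hello"` while `without_blank` is `"helo"`, which shows why the blank is needed to emit a doubled letter. The CTC loss sums the probability of every path that collapses to the target sequence.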

5. Whisper

Exercise using OpenAI Whisper and Gradio

6. E2E ASR model fine-tuning with NeMo

QuartzNet model fine-tuning with NeMo (English to Korean)


7. WFST

Exercise on WFSTs using k2

  • C, L, G transducers
  • composition, determinization
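
To illustrate composition, the sketch below composes two toy transducers in plain Python. It is not k2's API or implementation. It assumes epsilon-free transducers and the tropical semiring (weights add along a path), and the dictionary layout, state numbers, labels, and weights are all hypothetical.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free transducers; matched arc weights are added.

    Each transducer is a dict with:
      "start":  start state
      "finals": {state: final_weight}
      "arcs":   {state: [(ilabel, olabel, weight, next_state), ...]}
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        state_a, state_b = state
        # A pair state is final only if both component states are final.
        if state_a in a["finals"] and state_b in b["finals"]:
            finals[state] = a["finals"][state_a] + b["finals"][state_b]
        for ilabel, mid_a, weight_a, next_a in a["arcs"].get(state_a, []):
            for mid_b, olabel, weight_b, next_b in b["arcs"].get(state_b, []):
                # An output label of `a` must match an input label of `b`.
                if mid_a != mid_b:
                    continue
                next_state = (next_a, next_b)
                arcs.setdefault(state, []).append(
                    (ilabel, olabel, weight_a + weight_b, next_state)
                )
                if next_state not in seen:
                    seen.add(next_state)
                    queue.append(next_state)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy transducers: `first` rewrites x, y to p, q; `second` rewrites p, q to P, Q.
first = {
    "start": 0,
    "finals": {2: 0.0},
    "arcs": {
        0: [("x", "p", 0.5, 1), ("x", "r", 0.25, 1)],
        1: [("y", "q", 1.0, 2)],
    },
}
second = {
    "start": 0,
    "finals": {2: 0.5},
    "arcs": {0: [("p", "P", 1.5, 1)], 1: [("q", "Q", 2.0, 2)]},
}
composed = compose(first, second)
```

The `x:r` arc in `first` has no matching input label in `second`, so it does not appear in the result. Only states reachable from the paired start state are created, which keeps the composed machine small.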

8. E2E ASR model fine-tuning with Hugging Face

Wav2Vec2.0 model fine-tuning with Hugging Face (English to Korean)

Whisper model fine-tuning with Hugging Face (English to Korean)

Course Materials

Chapter 1

