tjysdsg / tone_classifier

Mandarin Tone Classifier

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

  1. Place your own phone_ctm.txt file in project root dir, or use the default one generated from https://github.com/tjysdsg/aidatatang_force_align on AISHELL-3 data
  2. Run
  python feature_extration.py

to collect required statistics (phone start time, duration, tones, etc). Results are saved to utt2tones.json

  1. Run
  python trian/embedding/split_wavs.py

to split train, test, and validation dataset for embedding model training

The test utterances used in the paper are listed in test_utts.json

  1. Run
python train/train_embedding.py

to train embedding model, the results are in exp/

Mel-spectrogram cache is generated at exp/cache/spectro/wav.scp and exp/cache/spectro/*.npy

[Optional] Train an end-to-end tone recognizer

After step 3,

  • Run the following at e2e_tone_recog/
./run.sh

About

Mandarin Tone Classifier


Languages

Language:Jupyter Notebook 57.2%Language:Python 39.5%Language:Shell 3.2%