danijel3 / CTMtoEMU

A Python program that converts CTM files (usually generated by Kaldi) into an EMU-SDMS database.

CTMtoEMU

Requirements

This program is largely self-contained and has very few requirements, apart from a grapheme-to-phoneme (G2P) transcriber.

These can be installed using pip or easy_install:

  • tqdm - for progress bars
  • pyphen - for syllabification

These have to be installed separately:

  • R - required for signal processing (formant estimation)
  • wrassp - speech processing library installable from within R
  • phonetisaurus - for G2P
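Before running the converter, it can help to confirm that the requirements listed above are visible from the current environment. The following is a minimal sketch, not part of this project: it assumes that R is reachable through the `Rscript` executable on the PATH and that Phonetisaurus is reachable as `phonetisaurus-g2pfst`. It does not check whether the wrassp package is installed inside R.

```python
import importlib.util
import shutil

# Python packages and executables named in the requirements above.
# "Rscript" is an assumption about how R is reached on this machine.
PYTHON_PACKAGES = ("tqdm", "pyphen")
EXECUTABLES = ("Rscript", "phonetisaurus-g2pfst")


def check_requirements(packages=PYTHON_PACKAGES, executables=EXECUTABLES):
    """Return a dict mapping each requirement name to True if it was found."""
    status = {}
    for name in packages:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the executable is not on the PATH.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If the Phonetisaurus binary lives outside the PATH, it will be reported as missing here even though it can still be passed explicitly with --phonetisaurus.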

About Phonetisaurus

Phonetisaurus is available from its upstream repository:

https://github.com/AdolfVonKleist/Phonetisaurus

Only the C++ binaries are needed; the Python bindings are not required.

If you have a working Kaldi installation, you can run ./extras/install_phonetisaurus.sh from Kaldi's tools directory and use the binary built there.

A Phonetisaurus model for Polish G2P is included in this repository (the model.fst file).
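To test the binary and the model on their own, independently of this program, something like the sketch below could be used. It rests on several assumptions that should be checked against your Phonetisaurus build: that `phonetisaurus-g2pfst` accepts `--model=` and `--wordlist=` options as shown in the Phonetisaurus README, and that each output line is tab-separated as word, score, phoneme sequence. The function names are hypothetical, and this is not necessarily how CTM_to_Emu.py calls the tool internally.

```python
import subprocess


def g2p_command(binary, model, wordlist):
    """Build the argument list for a G2P run (option names are assumed)."""
    return [binary, f"--model={model}", f"--wordlist={wordlist}"]


def parse_g2p_line(line):
    """Parse one output line, assumed to be 'word<TAB>score<TAB>phonemes'."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"unexpected G2P output line: {line!r}")
    word = fields[0]
    score = float(fields[1])
    phonemes = fields[2].split()
    return word, score, phonemes


def run_g2p(binary, model, wordlist):
    """Run the binary and return a list of (word, score, phonemes) tuples."""
    result = subprocess.run(
        g2p_command(binary, model, wordlist),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return [parse_g2p_line(line) for line in result.stdout.splitlines() if line.strip()]
```

A call such as `run_g2p("phonetisaurus-g2pfst", "model.fst", "words.txt")` would then transcribe one word per line of a hypothetical `words.txt`.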

Usage

usage: CTM_to_Emu.py [-h] [--wav-scp WAV_SCP] [--utt2ses UTT2SES] [-o]
                     [-n NAME] [-r RATE] [--rm-besi RM_BESI]
                     [--phonetisaurus PHONETISAURUS] [--g2p-model G2P_MODEL]
                     [--feat FEAT] [-s] [--segs SEGS] [--split SPLIT]
                     out_dir words_ctm phones_ctm [wav]

Program to convert CTM files (usually generated by Kaldi) into a folder
structure used by EMU-SDMS

positional arguments:
  out_dir               Output directory
  words_ctm             CTM containing words
  phones_ctm            CTM containing phonemes
  wav                   Wave file corresponding to CTMs (only if single file,
                        use --wav-scp for multiple files)

optional arguments:
  -h, --help            show this help message and exit
  --wav-scp WAV_SCP     List of WAV files if CTM contains references to
                        multiple files. Uses Kaldi wav.scp format.
  --utt2ses UTT2SES     List of utterance to session mappings (similar to
                        utt2spk).
  -o, --overwrite       Overwrite output directory.
  -n NAME, --name NAME  Name of the database.
  -r RATE, --rate RATE  Samplerate of WAV file
  --phonetisaurus PHONETISAURUS
                        Path to the phonetisaurus-g2pfst program.
  --g2p-model G2P_MODEL
                        Path to the FST G2P model.
  --feat FEAT           Compute extra features (comma separated) using R
                        package "wrassp", e.g.: forest, ksvF0, mhsF0, rmsana,
                        zcrana
  -s, --symlink         Use symlinks instead of copying audio to database
  --segs SEGS           Use segments file

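The options above can be combined in many ways, so a small helper that assembles the command line may be convenient when the converter is driven from another script. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, all file names in the usage note are placeholders, and only flags listed in the help text above are used.

```python
import sys


def build_command(out_dir, words_ctm, phones_ctm, wav=None, wav_scp=None,
                  name=None, g2p_binary=None, g2p_model=None, features=(),
                  symlink=False, overwrite=False, script="CTM_to_Emu.py"):
    """Return an argument list for CTM_to_Emu.py built from the documented flags."""
    if wav is not None and wav_scp is not None:
        # The help text describes these as alternatives: one file or a list.
        raise ValueError("pass either a single wav file or --wav-scp, not both")

    cmd = [sys.executable, script]
    if wav_scp is not None:
        cmd += ["--wav-scp", wav_scp]
    if name is not None:
        cmd += ["--name", name]
    if g2p_binary is not None:
        cmd += ["--phonetisaurus", g2p_binary]
    if g2p_model is not None:
        cmd += ["--g2p-model", g2p_model]
    if features:
        # --feat takes a comma-separated list of wrassp feature names.
        cmd += ["--feat", ",".join(features)]
    if symlink:
        cmd.append("--symlink")
    if overwrite:
        cmd.append("--overwrite")

    # Positional arguments come last, in the order shown in the usage line.
    cmd += [out_dir, words_ctm, phones_ctm]
    if wav is not None:
        cmd.append(wav)
    return cmd
```

For example, `build_command("emu_db", "words.ctm", "phones.ctm", wav_scp="wav.scp", g2p_model="model.fst", features=["forest", "ksvF0"], symlink=True)` returns a list that can be passed to `subprocess.run(..., check=True)`.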