speechlmscore_tool

Implementation of "SpeechLMScore: Evaluating speech generation using speech language model" https://arxiv.org/abs/2212.04559

Installation

Install the required Python packages with:

python setup.py install

Usage

Download pretrained models

Download the following pretrained models and update their paths in run.sh.
Note: tokens.txt is located alongside the Speech ulm model.

Pretrained Hubert
Pretrained Hubert-kmeans
Speech ulm

Run the following command to download all the above models:

./download_pretrained_models.sh

Compute SpeechLMScore using pretrained models

Computes the SpeechLMScore of each file in audio_dir and writes the results to the file ppl.
Audio files with a sampling rate of 16 kHz are supported.
Note: to use audio files other than .wav, set the ext variable in run.sh.
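Because the tool expects 16 kHz audio, it can help to check a folder before running it. The following is a minimal sketch and not part of this repository: the function name find_wrong_rate is hypothetical, and it relies on Python's standard wave module, which only reads uncompressed PCM .wav files.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 16000  # 16 kHz, the sampling rate this README says is supported


def find_wrong_rate(audio_dir, expected_rate=EXPECTED_RATE):
    """Return (file name, rate) pairs for .wav files not sampled at expected_rate.

    Hypothetical helper for illustration; it is not provided by this tool.
    """
    mismatched = []
    for path in sorted(Path(audio_dir).glob("*.wav")):
        # The wave module reads the sampling rate from the file header.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate != expected_rate:
            mismatched.append((path.name, rate))
    return mismatched
```

Any file reported by this check would need to be resampled to 16 kHz with an audio tool of your choice before scoring.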

audio_dir=<folder containing audio>
layer=<Hubert layer to extract features>

./run.sh ${audio_dir} ${layer}
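For intuition about what the score means: the paper describes SpeechLMScore as the average log-probability of the discrete token sequence under the speech language model. The sketch below illustrates that definition on made-up per-token log-probabilities. It does not reproduce the code behind run.sh; the function names are hypothetical, and treating perplexity as exp(-score) is an assumption about how the values in ppl relate to the score.

```python
import math


def speech_lm_score(token_log_probs):
    """Average per-token natural-log probability of a token sequence.

    token_log_probs: log p(d_i | d_<i) for each token, as a speech LM would
    produce them. Hypothetical helper for illustration only.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return sum(token_log_probs) / len(token_log_probs)


def perplexity(token_log_probs):
    """Perplexity of the same sequence, assuming ppl = exp(-average log-prob)."""
    return math.exp(-speech_lm_score(token_log_probs))


# Made-up example: four tokens, each predicted with probability 0.5.
example = [math.log(0.5)] * 4
score = speech_lm_score(example)  # equals log(0.5), about -0.693
ppl = perplexity(example)         # about 2.0
```

Under this definition, a higher (less negative) score means the language model finds the token sequence more likely.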

Train speech language models

A speech language model can also be trained and then used for evaluation. More Details
