🗃️ This repository contains the code to replicate the experiments in the paper "Mispronunciation detection using self-supervised speech representations" presented at the SLaTE Workshop held in Dublin 2023.
💡 We study the use of different self-supervised models (SSL) for the task of mispronunciation detection for second language learners. We compare two downstream approaches:
- PR: Training the model for phone recognition using native data from TIMIT database.
- MD: Training the model directly for the target task using non-native Enligsh data from EpaDB and L2Arctic databases.
We also compare the performance of each approach using representations extracted from a traditional HMM-DNN system. The SSL models that we compare are: HuBERT, LightHuBERT small, WavLM large and WavLM+.
⚙️ We implement the SSL models using the s3prl toolkit. The code consists of two directories containing two different pronunciation assessment downstreams, the PR and the MD approach in the paper. Directory pr
contains the PR approach and directory md
contains the MD approach. Both directories contain:
- Data folders: there is one data folder for each database used (i.e data_epa contains the data for EpaDB database). We use TIMIT, EpaDB and L2Arctic in the PR approach and only the last two in the MD approach. Data folders contain alignement files for phones and labels, a lengths file used for bucketing, the splits of the data used for the experiments, the list of phones and the weights for each phone in this experiments.
- Config file: the config file contains the parameters of the experiment.
- Scripts for running the experiments as required by s3prl.
- The EpaDB dataset downloaded. You can ask for the most updated version at jvidal@dc.uba.ar.
- The L2Arctic dataset downloaded. You can download the most updated version from this link.
- The s3prl toolkit downloaded or cloned.
An exact frozen version of the databases we used in this experiments can be downloaded from this link.
- Clone s3prl
git clone https://github.com/s3prl/s3prl.git
- Clone this repository:
git clone https://github.com/JazminVidal/ssl-mispron.git
and move or link the pr and the md directories to the downstream directory inside s3prl.
3. Download the databases. Copy or link them to the corresponding data folder inside the downstream folder.
To obtain results for the PR systems. From the root of the s3prl directory run the following command:
python3 run_downstream.py -m train -n TEST -u upstream -d pr
where upstream can be any of the upstreams implemented in s3prl, for example hubert.
To obtain results for the MD systems. From the root of the s3prl directory run the following command:
python3 run_downstream.py -m train -n TEST -u upstream -d md
where upstream can be any of the upstreams implemented in s3prl, for example hubert.