MaloMn / wav2vec2-phone-classification

Phone Classification using Wav2Vec2

This repository contains SpeechBrain recipes to fine-tune Wav2Vec2 models on a phone classification task. The following factors were analysed:

  1. Fine-tuning Wav2Vec2,
  2. Pre-training datasets,
  3. Model size,
  4. Fine-tuning datasets.

Results of this work have been published at the Interspeech 2024 conference.

Code

  • The recipes/ folder contains all SpeechBrain recipes.
  • The results obtained are available in the confusion-matrix/ folder.
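
For readers unfamiliar with how a confusion matrix is derived from phone-level predictions, here is a minimal sketch in plain Python. It is not the code used in this repository: the function names `confusion_matrix` and `per_phone_recall` and the toy labels are hypothetical, and the actual file format in confusion-matrix/ is not assumed.

```python
from collections import Counter


def confusion_matrix(references, predictions):
    """Count (reference, predicted) phone pairs.

    Returns the sorted label list and a square matrix where
    rows are reference phones and columns are predicted phones.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    labels = sorted(set(references) | set(predictions))
    counts = Counter(zip(references, predictions))
    matrix = [[counts[(ref, hyp)] for hyp in labels] for ref in labels]
    return labels, matrix


def per_phone_recall(labels, matrix):
    """Fraction of each reference phone that was classified correctly."""
    recall = {}
    for i, label in enumerate(labels):
        total = sum(matrix[i])
        if total:  # skip phones that never occur as a reference
            recall[label] = matrix[i][i] / total
    return recall


# Toy labels invented for illustration only (not taken from any corpus).
refs = ["a", "a", "i", "i", "u", "u"]
hyps = ["a", "i", "i", "i", "u", "a"]

labels, matrix = confusion_matrix(refs, hyps)
recall = per_phone_recall(labels, matrix)

for label, row in zip(labels, matrix):
    print(label, row)
print(recall)
```

Each row sums to the number of occurrences of that reference phone, so dividing the diagonal entry by the row sum gives the per-phone recall.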

Data

For confidentiality reasons, the datasets are not included in this repository. This work relies on the C2SI, CommonPhone and BREF corpora.

How to cite

If you use this work, please cite as:

@inproceedings{maisonneuve24,
  author    = {Malo Maisonneuve and Corinne Fredouille and Muriel Lalain and Alain Ghio and Virginie Woisard},
  title     = {{Towards objective and interpretable speech disorder assessment: a comparative analysis of CNN and transformer-based models}},
  year      = 2024,
  booktitle = {Proc. Interspeech 2024}
}

About

License: MIT License

