bootphon / phonemizer

Simple text to phones converter for multiple languages

Home Page: https://bootphon.github.io/phonemizer/

Issues with diacritics in French

CorentinJ opened this issue

Hello, I found some issues with the French "à" when using the espeak backend. I'm running phonemizer 2.2.1 with espeak-ng-1.49.2.

Command line:

echo "Je vais à" | phonemize -l=fr-fr -b=espeak
ʒə vɛ

The "à" is simply removed. The same goes for "é"; I haven't tried other characters.

Python:

>>> from phonemizer import phonemize
>>> phonemize('Je vais à', language="fr-fr", backend="espeak")
'ʒə vɛz aaksɑ̃ɡʁav '

This behaviour isn't consistent with the command-line result above, and it is problematic in a different way: here the "à" is explicitly spelled out by name, as "a accent grave".

Hi, unfortunately this is an espeak-ng issue that we cannot deal with in phonemizer... I suggest you open an issue there directly. (I tried with espeak-1.48, espeak-ng-1.49 and espeak-ng-1.50, and got the same result each time.)

$ echo "je vais à" | phonemize -l fr-fr -b espeak
ʒə vɛz aaksɑ̃ɡʁav 
$ echo "je vais à pied" | phonemize -l fr-fr -b espeak
ʒə vɛz a pje
$ echo "ô" | phonemize -l fr-fr -b espeak
oaksɑ̃siʁkɔ̃flɛks
$ echo "ô César" | phonemize -l fr-fr -b espeak
oː sezaʁ
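
In the four examples above, the letter name is spelled out only when the input ends with a lone accented letter ("je vais à", "ô"), and not when another word follows it. As a rough sketch for spotting such inputs before they reach the backend, something like the following could be used. Note that `ends_with_lone_accented_letter` is a hypothetical helper, not part of phonemizer or espeak-ng, and the pattern it checks is only inferred from these four cases.

```python
import unicodedata


def is_accented_letter(token: str) -> bool:
    """Return True if token is one letter carrying a combining mark (e.g. "à", "ô")."""
    # Compose first so that "a" + U+0300 is treated the same as the single char "à".
    token = unicodedata.normalize("NFC", token)
    if len(token) != 1 or not token.isalpha():
        return False
    # Decompose and look for any combining mark (accent) on the base letter.
    decomposed = unicodedata.normalize("NFD", token)
    return any(unicodedata.combining(c) for c in decomposed)


def ends_with_lone_accented_letter(text: str) -> bool:
    """Hypothetical check: is the last whitespace-separated token a lone accented letter?

    Assumption: this is the pattern that gets spelled out by name, based only on
    the four examples in this thread.
    """
    tokens = text.split()
    return bool(tokens) and is_accented_letter(tokens[-1])


if __name__ == "__main__":
    for sample in ["je vais à", "je vais à pied", "ô", "ô César"]:
        print(repr(sample), ends_with_lone_accented_letter(sample))
```

This only flags suspicious inputs; it does not change what espeak outputs for them.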

Indeed, I was able to reproduce this with espeak alone, including on the latest version. I filed an issue here: espeak-ng/espeak-ng#854

Thank you!