bootphon / phonemizer

Simple text to phones converter for multiple languages

Home Page: https://bootphon.github.io/phonemizer/

Phonemes missing in German phonemizer in Espeak-NG Backend

Pranjalya opened this issue

Describe the bug
When phonemizing German words with the Espeak-NG backend, some words come back with double question marks (??) in place of phonemes. On further testing, I found that this happens specifically with words containing the substring 'ur'.

Phonemizer version
phonemizer-3.2.1
available backends: espeak-ng-1.50, segments-2.2.1

System
Ubuntu Docker image, Python 3.9. The behavior is the same with Python 3.8.
The issue also persists on Ubuntu 22.04, with both standard Python and an Anaconda environment.

To reproduce

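from phonemizer.backend import EspeakBackend

# German espeak backend that keeps punctuation and stress marks in the output
phonemizer_backend = EspeakBackend(language='de',
                                   punctuation_marks=';:,.!?¡¿—…"«»“”~/。【】、‥،؟“”؛',
                                   preserve_punctuation=True,
                                   language_switch='remove-flags',
                                   with_stress=True)
# With espeak-ng 1.50 the printed phones contain '??' where 'ur' should be
print(phonemizer_backend.phonemize(["frankfurt"], strip=True))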

Expected behavior
The phones should not be missing and proper phonemes should appear instead of ??.
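To check a larger word list for the same symptom, a small helper along these lines might be useful. It is only a sketch: `find_unresolved` and `PLACEHOLDER` are hypothetical names, not part of phonemizer, and the check assumes that a literal `??` in the output always marks a missing phoneme. With `preserve_punctuation=True`, real question marks in the input text would also match, so it is best applied to single words.

```python
from typing import Dict, List

# Marker reported in this issue in place of the missing phonemes.
PLACEHOLDER = "??"


def find_unresolved(words: List[str], phonemized: List[str]) -> Dict[str, str]:
    """Return {word: output} for every output that still contains the placeholder."""
    if len(words) != len(phonemized):
        raise ValueError("words and phonemized must have the same length")
    return {w: p for w, p in zip(words, phonemized) if PLACEHOLDER in p}


# The two strings below are the outputs reported in this issue thread.
print(find_unresolved(["frankfurt"], ["frˈaŋkf??t"]))   # affected output
print(find_unresolved(["frankfurt"], ["frˈaŋkfʊɐt"]))   # unaffected output
```

In practice the second argument would be the list returned by `phonemizer_backend.phonemize(words, strip=True)`.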

Hi, this is an espeak-related problem, not a phonemizer one. The problem is not present with espeak-1.48, so you can try that version instead:

$ espeak-ng --version
eSpeak NG text-to-speech: 1.50  Data at: /usr/lib/x86_64-linux-gnu/espeak-ng-data
$ espeak-ng --ipa -v de frankfurt
frˈaŋkf??t

$ espeak --version
eSpeak text-to-speech: 1.48.15  16.Apr.15  Data at: /usr/lib/x86_64-linux-gnu/espeak-data
$ espeak --ipa -v de frankfurt
 frˈaŋkfʊɐt
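The comparison above can also be scripted, which may help when checking several words or machines. The following is a rough sketch, not a supported workflow: the helper names `ipa` and `compare` are made up for illustration, the binaries are assumed to be on `PATH` under the names `espeak-ng` and `espeak`, and `-q` is added only to suppress audio playback.

```python
import shutil
import subprocess
from typing import Dict, Optional, Sequence


def ipa(binary: str, word: str, voice: str = "de") -> Optional[str]:
    """Run `<binary> -q --ipa -v <voice> <word>`; return None if the binary is absent."""
    if shutil.which(binary) is None:
        return None
    result = subprocess.run(
        [binary, "-q", "--ipa", "-v", voice, word],
        capture_output=True, encoding="utf-8", errors="replace", check=False,
    )
    return result.stdout.strip()


def compare(word: str,
            binaries: Sequence[str] = ("espeak-ng", "espeak")) -> Dict[str, Optional[str]]:
    """Collect the IPA output of each available binary for one word."""
    return {binary: ipa(binary, word) for binary in binaries}


for binary, output in compare("frankfurt").items():
    if output is None:
        print(f"{binary}: not installed")
    else:
        status = "contains ??" if "??" in output else "ok"
        print(f"{binary}: {output} ({status})")
```

A binary that is not installed is reported as such rather than raising, so the script can run on systems that have only one of the two versions.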