Disparity between backends with punctuation
agkphysics opened this issue
Describe the bug
When using the default preserve_punctuation=False, the Festival backend drops input items that contain only punctuation, whereas the Espeak backend returns an empty string for each of them. As a result, the Festival output list can be shorter than the input list.
Phonemizer version
phonemizer-3.2.1
available backends: espeak-ng-1.50, espeak-mbrola, festival-2.5.0, segments-2.2.1
System
Ubuntu 20.04.4
Linux kernel 5.15.0
Python 3.8.10
To reproduce
from phonemizer import phonemize
print(phonemize([".", "."], language="en-us", backend="festival"))
print(phonemize([".", "."], language="en-us", backend="espeak"))
print(phonemize([".", "."], language="mb-us1", backend="espeak-mbrola"))
Yields output
[]
['', '']
['', '']
Expected behavior
Should output:
['', '']
['', '']
['', '']
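Until the backends agree, one possible workaround is to keep punctuation-only items away from the backend and put empty strings back in their place afterwards. This is only a sketch under assumptions, not a fix in phonemizer itself: `has_word_chars` and `phonemize_aligned` are hypothetical helper names, "punctuation-only" is approximated as "contains no letter or digit", and `phonemize_fn` stands for any callable with the same list-in/list-out behaviour as `phonemizer.phonemize`.

```python
from typing import Callable, List


def has_word_chars(text: str) -> bool:
    """Hypothetical check: True if the item has at least one letter or digit."""
    return any(ch.isalnum() for ch in text)


def phonemize_aligned(
    texts: List[str],
    phonemize_fn: Callable[..., List[str]],
    **kwargs,
) -> List[str]:
    """Return one output per input, using '' for punctuation-only items.

    Assumption: the backend returns exactly one output for every item that
    contains at least one letter or digit.
    """
    # Indices of items that are actually sent to the backend.
    keep = [i for i, text in enumerate(texts) if has_word_chars(text)]
    result = [""] * len(texts)
    if not keep:
        return result

    phonemized = phonemize_fn([texts[i] for i in keep], **kwargs)
    if len(phonemized) != len(keep):
        raise ValueError(
            f"backend returned {len(phonemized)} items for {len(keep)} inputs"
        )

    # Write each result back to the position of its original input.
    for i, phones in zip(keep, phonemized):
        result[i] = phones
    return result
```

With the real library this would be called as `phonemize_aligned([".", "."], phonemize, language="en-us", backend="festival")`. It only makes sense for preserve_punctuation=False, since punctuation-only items are always replaced by an empty string.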
Hi, actually with preserve_punctuation=True another bug occurs:
from phonemizer import phonemize
print(phonemize([".", "."], language="en-us", backend="festival", preserve_punctuation=True))
print(phonemize([".", "."], language="en-us", backend="espeak", preserve_punctuation=True))
print(phonemize([".", "."], language="mb-us1", backend="espeak-mbrola", preserve_punctuation=True))
Yields
['..']
['..']
['', '']
But should be (espeak-mbrola does not support punctuation):
['.', '.']
['.', '.']
['', '']
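Both problems show up as a mismatch between the number of inputs and the number of outputs, so a small check could catch them across backends. The following is a rough sketch, not part of phonemizer: `find_misaligned` is a hypothetical helper, and `phonemize_fn` is assumed to accept `language` and `backend` keyword arguments the way `phonemizer.phonemize` does in the snippets above.

```python
from typing import Callable, Dict, List


def find_misaligned(
    texts: List[str],
    backend_languages: Dict[str, str],
    phonemize_fn: Callable[..., List[str]],
    **kwargs,
) -> Dict[str, List[str]]:
    """Map each backend whose output length differs from the input length
    to the output it produced. Backends with aligned output are omitted."""
    mismatches: Dict[str, List[str]] = {}
    for backend, language in backend_languages.items():
        output = phonemize_fn(texts, language=language, backend=backend, **kwargs)
        if len(output) != len(texts):
            mismatches[backend] = output
    return mismatches
```

For example, `find_misaligned([".", "."], {"festival": "en-us", "espeak": "en-us", "espeak-mbrola": "mb-us1"}, phonemize, preserve_punctuation=True)` would, going by the output reported above, flag festival and espeak, since each returns a single merged item for two inputs.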