bootphon / phonemizer

Simple text to phones converter for multiple languages

Home Page: https://bootphon.github.io/phonemizer/

Phonemize gets stuck on some sentences

DasAnish opened this issue

'ज्ञान'
Just one word

Similar issues happen in Bengali as well, but I have not looked at them closely.

What backend are you using?

Could you provide us with some more context: OS, phonemizer version, backend used, and, if you're using it from the Python API, a code sample that reproduces the error?

Hi, I am facing a similar issue in Hindi. I am running the model on Linux, with phonemizer version 3.2.1, using the Python API with espeak as the backend.

Code:

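from phonemizer import phonemize
from phonemizer.separator import Separator

# A single Hindi word that triggers the hang
text = 'जासूस'
phn = phonemize(
    text,
    language='hi',
    backend='espeak',
    separator=Separator(phone=None, word=' ', syllable='|'),
    njobs=4)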

Hi, I also have this issue with Sinhala, using the Python API with espeak as the backend.

Could you give me the output of phonemize --version, as well as the exact OS you are using (which Linux distribution)?

@hadware
This is the result; the OS is Ubuntu 18.04.

phonemizer-2.2.2
available backends: espeak-ng-1.49.2, segments-2.2.0
uninstalled backends: espeak-mbrola, festival

This is my code:

phonemes = phonemize(
    text,
    language=language,
    backend='espeak',
    strip=True,
    preserve_punctuation=True,
    with_stress=with_stress,
    punctuation_marks=self.punctuation,
    njobs=njobs,
    language_switch='remove-flags')

OK. Could you try updating phonemizer to the latest version (3.0.1)? I tested this on my setup and couldn't reproduce the error. This is my setup:

(On Ubuntu 20.04)
phonemizer-3.0.1
available backends: espeak-ng-1.50, espeak-mbrola, festival-2.5.0, segments-2.2.0

To update phonemizer, run pip install -U phonemizer

@hadware I've done that, but it's still not working.

Every time, it gets stuck on the same line:

භුගෝලීය වශයෙන් බැලූ කළද ශාන්ත කීටස් හා නේවිස් යනු ලීවර්ඩ් දූපත් හි කොටසක් වේ

Only some texts fail; others work.

It is getting stuck on some Hindi words. Can someone please help?
For example: रुचि , दिनांक , नौ
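When only some inputs hang, one way to narrow them down is to phonemize each input in its own child process with a time limit. The following is a minimal sketch, not something from phonemizer itself: the helper names probe and find_hanging_inputs and the 10-second default are assumptions made for illustration. Only the phonemize(text, language=..., backend='espeak') call comes from the reports above.

```python
import subprocess
import sys


def probe(snippet, timeout=10.0):
    """Run a Python snippet in a child process (hypothetical helper).

    Returns "ok" if it exits cleanly, "error" if it exits with a non-zero
    status, and "timeout" if it is still running after `timeout` seconds.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


def find_hanging_inputs(texts, language, timeout=10.0):
    """Return the inputs whose phonemization does not finish in time.

    Assumes phonemizer and espeak are installed in the current environment;
    if they are not, every probe reports "error" and nothing is returned.
    """
    template = (
        "from phonemizer import phonemize\n"
        "phonemize({text!a}, language={language!a}, backend='espeak')\n"
    )
    hanging = []
    for text in texts:
        # !a escapes non-ASCII characters so the snippet survives any locale.
        snippet = template.format(text=text, language=language)
        if probe(snippet, timeout) == "timeout":
            hanging.append(text)
    return hanging
```

For instance, find_hanging_inputs(['रुचि', 'दिनांक', 'नौ'], 'hi') would list the words that never return on an affected setup. Starting one interpreter per input is slow, so this is meant for isolating a reproducer rather than for production use.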

Hi, I suspect this to be an espeak bug, not a phonemizer one. So please make sure you have espeak-ng-1.50 installed.

Actually I have:

$ phonemize --version
phonemizer-3.2.1
available backends: espeak-ng-1.50, espeak-mbrola, festival-2.5.0, segments-2.1.3

And the following script works:

from phonemizer import phonemize
from phonemizer.separator import Separator

# gives 'ɟaːsuːs '
phonemize('जासूस', language='hi', backend='espeak')

# gives 'ɾʊcɪ dɪnãk nɔː '
phonemize('रुचि , दिनांक , नौ', language='hi', backend='espeak')

# gives 'bʰuɡoːliːjə wɐsəjen bæluː kɐɭədə saːntə kiːʈəs haː neːwis jɐnu liːwərɖ duːpət hi koʈəsək weː '
phonemize('භුගෝලීය වශයෙන් බැලූ කළද ශාන්ත කීටස් හා නේවිස් යනු ලීවර්ඩ් දූපත් හි කොටසක් වේ', language='si', backend='espeak')  

Hi, thanks for the reply. This is indeed an espeak-ng version issue. On CentOS 8, the latest available version is 1.49.2. Building espeak-ng from source solves the issue.
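Since the hang tracked the espeak-ng version in this thread (1.49.2 affected, 1.50 working), a quick check of the phonemize --version output can flag an old backend before any text is processed. This is a rough sketch under assumptions: the helper names and the regular expression are illustrative, and the (1, 50) threshold is simply the version recommended above, not a documented cutoff.

```python
import re

# Threshold taken from this thread, not from any official compatibility table.
MIN_ESPEAK = (1, 50)


def espeak_version(version_output):
    """Extract the espeak/espeak-ng version from `phonemize --version` text.

    Returns a tuple of ints such as (1, 49, 2), or None if no versioned
    espeak backend is listed.
    """
    match = re.search(r"espeak(?:-ng)?-(\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_recent_enough(version_output, minimum=MIN_ESPEAK):
    """True if the listed espeak version is at least `minimum`."""
    version = espeak_version(version_output)
    return version is not None and version >= minimum


# Outputs quoted earlier in this thread
old = ("phonemizer-2.2.2\n"
       "available backends: espeak-ng-1.49.2, segments-2.2.0\n"
       "uninstalled backends: espeak-mbrola, festival")
new = ("phonemizer-3.2.1\n"
       "available backends: espeak-ng-1.50, espeak-mbrola, "
       "festival-2.5.0, segments-2.1.3")

print(is_recent_enough(old))  # False
print(is_recent_enough(new))  # True
```

Comparing tuples of integers avoids the string-comparison trap where "1.9" would sort after "1.50".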