bootphon / phonemizer

Simple text to phones converter for multiple languages

Home Page: https://bootphon.github.io/phonemizer/

There is a difference between espeak-ng and phonemizer

hakan-demirli opened this issue

I have been using Phonemizer in a hobby project that I am currently trying to port to C++. I have read the code, and apart from input sanitization and basic pre/post-processing, it only contains bindings to the Espeak-ng system library, which accesses the same files my C++ code uses. So I can't see a reason why the outputs are different.

Phonemizer
həloʊ wɜːld
Espeak-ng
həlˈəʊ wˈɜːld

I don't know much about phonetics, but the only differences I see are o <-> ˈə and ɜ <-> ˈɜ. Can I just map those differences and call it a day?
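Before mapping anything by hand, it may help to list exactly where the two strings diverge. The following is only a minimal sketch using Python's standard difflib module; the helper name phoneme_diff is hypothetical and is not part of Phonemizer or Espeak-ng.

```python
import difflib


def phoneme_diff(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return (operation, text_in_a, text_in_b) for every span where a and b differ."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the differing spans
    ]


# The two outputs quoted in this issue.
phonemizer_out = "həloʊ wɜːld"
espeak_out = "həlˈəʊ wˈɜːld"

for tag, left, right in phoneme_diff(phonemizer_out, espeak_out):
    print(f"{tag}: {left!r} -> {right!r}")
```

On these two strings the differences come down to the ˈ mark in each word plus a single vowel substitution (o versus ə), so they are two separate kinds of difference rather than one mapping.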

Phonemizer: Latest version
Espeak-ng: Latest version
OS: Pop!_OS 22.04 LTS

Python Code:

from phonemizer import phonemize

text = "hello world"
lang = 'en-us'
phonemes = phonemize(text,
                        language=lang,
                        backend='espeak',
                        strip=True,
                        preserve_punctuation=True,
                        with_stress=False,
                        njobs=4,
                        punctuation_marks=';:,.!?¡¿—…"«»“”()',
                        language_switch='remove-flags')
print(phonemes)

CPP code:

// gcc test-espeak.c -lespeak-ng -o test-espeak

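#include <stdio.h>   // printf
#include <string.h>  // memset
#include <espeak-ng/speak_lib.h>

espeak_AUDIO_OUTPUT output = AUDIO_OUTPUT_SYNCHRONOUS;
char *path = NULL;               // NULL: use the default espeak-ng data directory
void *user_data = NULL;          // no callback data
unsigned int *identifier = NULL; // no message identifier requested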

int main(int argc, char* argv[] ) {
  char text[] = {"hello world"};
  int buflength = 500, options = 0;
  unsigned int position = 0, position_type = 0, end_position = 0, flags = espeakCHARS_AUTO;
  espeak_Initialize(output, buflength, path, options );
  espeak_VOICE voice;
  memset(&voice, 0, sizeof(espeak_VOICE)); // Zero out the voice first
  const char *langNativeString = "en"; // Set voice by properties
  voice.languages = langNativeString;
  voice.name = "US";
  voice.variant = 2;
  voice.gender = 1;
  espeak_SetVoiceByProperties(&voice);
  espeak_SetPhonemeTrace(espeakPHONEMES_IPA,NULL);
  printf("Saying  '%s'...\n", text);
  espeak_Synth(text, buflength, position, position_type, end_position, flags, identifier, user_data);

  printf("Done\n");
  return 0;
}

Hi, a very quick answer. To keep the ˈ marks (they indicate stress) in the phonemizer output, use the with_stress=True parameter.
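Going the other way, the raw Espeak-ng string can be brought in line with with_stress=False output by dropping the stress marks. This is a rough sketch, not Phonemizer's own implementation: the helper name strip_stress is hypothetical, and it assumes only the IPA primary (ˈ) and secondary (ˌ) stress marks need removing.

```python
# Translation table that deletes the IPA primary and secondary stress marks.
STRESS_MARKS = {ord("ˈ"): None, ord("ˌ"): None}


def strip_stress(ipa: str) -> str:
    """Return the IPA string with stress marks removed."""
    return ipa.translate(STRESS_MARKS)


print(strip_stress("həlˈəʊ wˈɜːld"))  # prints: hələʊ wɜːld
```

The result still differs from the Phonemizer output in one vowel (ə versus o), so that part is not a stress issue. One thing that may be worth checking, as an assumption not confirmed in this thread, is whether both sides select the same voice: the Python call requests 'en-us' while the C code requests "en".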

For the o <-> ˈə stuff I really don't know... Are you sure you are using the same library on both sides?

I am sure the libraries are the same. The C++ build uses a CMakeLists file with an absolute path to the *.so files. I stepped through Phonemizer in a debugger up to this line:

f_text_to_phonemes = self._library.espeak_TextToPhonemes

and verified the library dependency by removing all espeak libraries and running the script again.

Thank you for explaining the accentuation. Now the outputs are quite close, and my project is working as expected. The o <-> ə difference is not critical for me since this is a hobby project.