bootphon / phonemizer

Simple text to phones converter for multiple languages

Home Page: https://bootphon.github.io/phonemizer/

Memory leak that consumes 5 MB per usage of phonemize function

ollayf opened this issue

Describe the bug
There is a memory leak: each call to the phonemize function consumes at least 5 MB for me. I have tried many things, but to no avail. This is how I use it in my Python code:
Screenshot from 2023-07-03 00-28-53

Phonemizer version
Screenshot from 2023-07-03 00-29-38

System
Ubuntu 20.04 LTS
Python 3.8

To reproduce
Screenshot from 2023-07-03 00-29-38

Expected behavior
Every time the function returns, the memory should be garbage collected and released back to the OS. Instead, every call permanently takes up 5 MB. I see this 5 MB both in htop and when I use psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2 in the Python code.
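
For anyone who wants to reproduce the measurement, here is a minimal sketch of how the per-call growth could be recorded with the psutil expression quoted above. The helper names read_rss_mb and rss_growth_per_call are hypothetical and not part of phonemizer or psutil; psutil itself is a third-party package and is assumed to be installed.

```python
import os


def read_rss_mb():
    """Return the resident set size of this process in MB (requires psutil)."""
    import psutil  # third-party; imported lazily so the helper below works without it
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2


def rss_growth_per_call(func, repeats=20, read_rss=read_rss_mb):
    """Call func `repeats` times and return the RSS change (MB) after each call.

    `read_rss` is injectable so another memory probe can be swapped in.
    """
    deltas = []
    before = read_rss()
    for _ in range(repeats):
        func()
        after = read_rss()
        deltas.append(after - before)
        before = after
    return deltas
```

Passing a zero-argument callable that wraps the phonemize call (for example a lambda) should show whether the growth is roughly constant per call, which is what a leak of this kind would look like.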

Additional context

Hi,

You may try passing a backend to your phonemize_word function. When creating a new espeak phonemization instance, the code actually copies the espeak shared library to a temp directory (that's the 5 MB). Normally the directory is deleted at exit or when garbage collected (this is a bit complex to deal with across Linux/Mac/Windows).
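
A rough sketch of that suggestion: build the backend once and reuse it, instead of creating a new one on every call. BackendCache and make_espeak_backend are hypothetical helper names, and the phonemize_word signature is assumed since the original function is only visible in the screenshot. EspeakBackend and its phonemize method come from phonemizer.backend.

```python
class BackendCache:
    """Keep one backend instance per language so it is created only once."""

    def __init__(self, factory):
        self._factory = factory   # callable: language -> backend instance
        self._backends = {}       # language -> cached backend

    def get(self, language):
        if language not in self._backends:
            self._backends[language] = self._factory(language)
        return self._backends[language]


def make_espeak_backend(language):
    # Imported lazily so the cache itself does not require phonemizer.
    from phonemizer.backend import EspeakBackend
    return EspeakBackend(language)


_cache = BackendCache(make_espeak_backend)


def phonemize_word(word, language="en-us"):
    # Assumed signature; reuses the cached backend instead of building a new one.
    backend = _cache.get(language)
    return backend.phonemize([word], strip=True)[0]
```

With this layout the espeak library copy described above should happen once per language rather than once per call, assuming the copy is tied to backend creation as stated.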

Same problem here: memory leaks on every call:
import phonemizer

phonemes = phonemizer.phonemize({orig_text_wo_stress}, language="en")

Don't instantiate one phonemizer backend per call:

phonemizer = BACKENDS[backend](