Pomax / PHP-Font-Parser

This is a PHP code library for parsing TTF/OTF fonts from file.

Memory exceed issue

9tontruck opened this issue · comments

I am trying to get the glyph paths for multiple letters in one Ajax call.
Simply put, here is what I want:
[input] $text: "a sentence", $font: 'arial'
[output] array of glyphs

Here is my code:

// * VectorFont.php * //

    require_once("fontparse.php");
    class VectorFont {

        protected $font;

        public function loadFont($font)
        {
            $this->font = new \OTTTFont($font);
        }
        public function getGlyph($letter)
        {
            $data = null;
            if($this->font) {
                $data = $this->font->get_glyph($letter);
            }
            return $data;
        }
    }

// * test.php * //

    getGlyphs('multiple letters', 'arial'); // memory exceed error

    // receive multi-letter text as an input and return array of glyphs
    public function getGlyphs($text, $font)
    {
        $glyphs = array();
        $this->vectorFont = new VectorFont();
        $this->vectorFont->loadFont($font.'.ttf');
        for($i=0 ; $i<mb_strlen($text) ; $i++) {

            // avoid the error that occurs if "get_glyph" is called with the same letter more than once
            $dupIndex = $this->findFirstDuplication(mb_substr($text, $i, 1), $text);

            if($dupIndex >= 0 && $dupIndex < $i) {
                // if duplicated letter existed previously, copy previous data 
                array_push($glyphs, $glyphs[$dupIndex]);

            } else {
                $glyph = $this->vectorFont->getGlyph(mb_substr($text, $i, 1));

                if($glyph) {
                    array_push($glyphs, $glyph);
                } else {
                    array_push($glyphs, "");
                }
            }
        }
        return $glyphs;
    }

First of all, an error occurs if "get_glyph" is called with the same letter more than once. I can write a workaround for that, so it is not a big problem.

The memory issue, however, is one I can hardly fix. I don't understand why memory usage exceeds the limit even though the font object is loaded only once, at the start. Do you have any idea how to fix this problem?

I've filed the error on multiple consults as #7

Thanks.
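
The duplicate-letter workaround described above can also be written as a small per-letter cache, so each distinct letter is looked up only once. This is a minimal sketch, not code from the library: `collectGlyphs` and the `$lookup` callable are hypothetical names, with `$lookup` standing in for a call such as `$font->get_glyph($letter)`.

```php
<?php
// Hypothetical helper: $lookup stands in for a call such as
// $font->get_glyph($letter); it is an assumption, not library code.
function collectGlyphs($text, callable $lookup)
{
    $cache  = array();   // letter => glyph data, filled on first use
    $glyphs = array();   // one entry per letter of $text, in order
    $length = mb_strlen($text, 'UTF-8');

    for ($i = 0; $i < $length; $i++) {
        $letter = mb_substr($text, $i, 1, 'UTF-8');
        if (!array_key_exists($letter, $cache)) {
            $data = $lookup($letter);
            // store "" for letters that could not be resolved
            $cache[$letter] = $data ? $data : "";
        }
        $glyphs[] = $cache[$letter];
    }
    return $glyphs;
}

// Usage with a stand-in lookup that counts how often it is called.
$calls  = 0;
$result = collectGlyphs("hello", function ($letter) use (&$calls) {
    $calls++;
    return "glyph:" . $letter;
});
echo count($result) . " glyphs from " . $calls . " lookups\n";
```

Keying the cache by letter avoids rescanning the string for earlier duplicates on every iteration, which is what the index-based approach does.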

And the memory issue is quite serious. The library cannot get the glyphs for multiple letters in one function call, because memory runs out on the server (with more than 10 letters in my case). This is a very serious problem, because the server is going to be extremely slow if multiple users use this feature at the same time.

This is the nature of PHP. I wrote the scripts initially because I needed individual letters, so the scripts spin up, find a letter, and die again, thus freeing all the memory they were using. This isn't a shaping engine, so for full string resolution the scripts would need considerable rewriting. How much memory are you allocating for the PHP process?

Running through the string "After all this time..." with DejaVu sans condensed (bold-oblique) makes PHP use about 30MB before returning. Using "After all this time, we decided that it would be best to leave 1983 to the chroniclers." makes it use 31MB at peak, so I strongly suspect your PHP config is set to the default 12MB, which will make it run out of memory really quickly. If you change that to 64MB (a fairly reasonable setting for modern setups) I suspect you'll get a much better result.
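
To check which limit the PHP process is actually running with, the configured value can be read at runtime and compared against peak usage. The following is a minimal sketch using PHP's built-in `ini_get()` and `memory_get_peak_usage()`; the helper `iniSizeToBytes` is a hypothetical name introduced here for illustration.

```php
<?php
// Hypothetical helper: convert a php.ini shorthand value such as "128M"
// into a number of bytes. A value of -1 means "no limit".
function iniSizeToBytes($value)
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    $unit   = strtolower(substr($value, -1));
    $number = (int) $value;   // leading digits only, e.g. "128M" => 128
    switch ($unit) {
        case 'g': return $number * 1024 * 1024 * 1024;
        case 'm': return $number * 1024 * 1024;
        case 'k': return $number * 1024;
        default:  return $number;   // plain byte count
    }
}

$limit = ini_get('memory_limit');
echo "memory_limit: " . $limit . " (" . iniSizeToBytes($limit) . " bytes)\n";
echo "peak usage:   " . round(memory_get_peak_usage(true) / 1048576, 1) . " MB\n";
```

Printing these two lines at the end of the failing request shows whether the limit in effect is the one expected, since a value set in php.ini can be overridden per directory or per script.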

I also landed a patch to fix errors on resolving a previously-found character, closing #7.

"Fatal error: Allowed memory size of 134217728 bytes exhausted"
According to the error message, I seem to have 128MB. An 11-letter string exceeds the limit.

There should be a simple way to fix this. Loading one font and searching for multiple entries should not drive memory usage to the limit.

Careful with claims that there should be "simple ways to fix this": the solution you describe is what the code already does. Unless you've read through the code and have a pull request with the fix ready, hold off on claims that things are simple =)

That said, I can try to replicate what you're doing; which font are you loading in, and what string are you trying to get the data for? I can see what kind of memory performance I get for the same attempt and maybe using your font will reveal something that I'm not seeing with the fonts I'm using for testing on my machine.

Sorry for saying there should be a simple way. I can imagine you have already gone through a lot of things. And thank you for all your contributions to help other people :)

I am loading arial.ttf, which is 761 KB. I just copied it from windows/fonts.
And I use "qwertyuiopasdfghj" as the input string.
Please let me know if you have any other questions

Using that same arial.ttf, and the following code, I see a peak working memory of about 30MB, with all glyphs retrieved and printed to the console.

<?php
  //UTF: ☺
  require_once("fontparse.php");
  $fonts = array("arial.ttf");
  $letters = "qwertyuiopasdfghj";

  $font = new OTTTFont(array_pop($fonts));
  echo "font header data:\n" . $font->toString() . "\n";

  foreach(str_split($letters) as $letter) {
    $data = $font->get_glyph($letter);
    // reset $json every iteration, so a missing glyph is not masked
    // by the JSON left over from the previous letter
    $json = ($data!==false) ? $data->toJSON() : false;
    if($json===false) { die("the letter '$letter' could not be found!"); }
    echo "glyph information for '$letter':\n" . $json;
  }
?>

Can you try a really long text on your server for me, please? Something like 1234567890-=qwertyuiop[]asdfghjkl;'zxcvbnm,./QWERTYUIOPASDFGHJKLZXCVBNM

If this works on your side, the memory problem must be coming from somewhere else in my code.

I used $letters = "1234567890-=qwertyuiop[]asdfghjkl;'zxcvbnm,./QWERTYUIOPASDFGHJKLZXCVBNM And some more text for good measure, because why wouldn't we just test the hell out of this anyway."; and saw a working set of 26MB with peak working set at 30.7MB

What happens when you try the script I used, rather than your own code? If that succeeds, we can be pretty sure it's not the font parser, but some other part of your project's code that makes use of the resulting data.

Sweet. Your library works perfectly, as it is supposed to.
The memory issue should be closed here because it is coming from somewhere else in my code.
Thank you, sir.

It's quite weird tho...
when I do "ini_set("memory_limit","64M");"
I can only write 5 letters
and when I do "ini_set("memory_limit","128M");"
I can only write 10 letters.

It's perfectly proportional. The more letters I try to process, the more memory is allocated.
I think it's weird because this problem does not seem to happen on your side. Are there any settings related to garbage collection on the server?
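
One way to narrow down where the per-letter growth comes from is to measure how much memory each step keeps after it returns. The sketch below is an assumption-laden illustration rather than part of the parser: `memoryGrowthPerLetter` and `$step` are hypothetical names, with `$step` standing in for whatever the application does per letter. It relies only on PHP's built-in `memory_get_usage()`, `gc_enabled()` and `gc_collect_cycles()`.

```php
<?php
// Run $step once per letter and record how many bytes each call retains.
// $step is a hypothetical stand-in for the application's per-letter work.
function memoryGrowthPerLetter($text, callable $step)
{
    $growth = array();
    // split into UTF-8 characters rather than bytes
    foreach (preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
        gc_collect_cycles();              // free collectable cycles first
        $before = memory_get_usage();
        $step($letter);
        gc_collect_cycles();
        $growth[] = array(
            'letter' => $letter,
            'bytes'  => memory_get_usage() - $before,
        );
    }
    return $growth;
}

// Stand-in step that deliberately keeps data alive between calls.
$kept = array();
$report = memoryGrowthPerLetter("abc", function ($letter) use (&$kept) {
    $kept[] = str_repeat($letter, 200000);
});

echo "cycle collector enabled: " . (gc_enabled() ? "yes" : "no") . "\n";
foreach ($report as $row) {
    echo $row['letter'] . ": " . $row['bytes'] . " bytes retained\n";
}
```

If the retained bytes stay high even after `gc_collect_cycles()`, the data is still referenced somewhere (for example in an array that keeps growing), and garbage-collection settings would not change that.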

Maybe your code is creating more downstream data based on what it gets back from the font parser? At any rate I'm going to close this ticket, since the original issue's been tracked to "not in the parser", but hopefully you can follow the code to where it's building upon the individual glyphs.