erusev / parsedown-extra

Markdown Extra Extension for Parsedown

Server crash on long image url

storeman opened this issue · comments

I've got quite an issue with Parsedown. It took a lot of digging, because PHP crashed and took Apache down with it. I don't know exactly what is happening, but when I parse the example below with Parsedown, it crashes the server. If I remove any line, it works correctly. The length of the URL seems to be the problem. I can create a workaround for this, but I think it is worth mentioning here.

require '../../../vendor/autoload.php';

$str = '![alt](64esabbase64eyppbWc6Y3JvcEZpdCg1MDAsMjYwKTpzaXplcygnKG1heC13aWR'
        . '0aDogNzY4cHgpIDI1dncnLCAnKG1pbi13aWR0aDogMTAwMHB4KSAyMjVweCcsICcxNjBw'
        . 'eCcpOnNyY1NldCgnMTIwLDAnLCAnMTg1LDAnLCAnMjI1LDAnKTpsaWdodGJveChyb3V0'
        . 'ZS01MTUpKn0=64esabbase64/img/diversen/klantfotos/2014-klantenfotos/f'
        . 'acebook-overig/brandenburg-sabine-horseshoe-bend.JPG \'Page, Horsesho'
        . 'e Bend foto Sabine Brandenburg\')';

$parser = new \Parsedown();
$content = $parser->text($str);

This seems to be related to your environment. The following markdown parses just fine:

![alt](64esabbase64eyppbWc6Y3JvcEZpdCg1MDAsMjYwKTpzaXplcygnKG1heC13aWR0aDogNzY4cHgpIDI1dncnLCAnKG1pbi13aWR0aDogMTAwMHB4KSAyMjVweCcsICcxNjBweCcpOnNyY1NldCgnMTIwLDAnLCAnMTg1LDAnLCAnMjI1LDAnKTpsaWdodGJveChyb3V0ZS01MTUpKn0=64esabbase64/img/diversen/klantfotos/2014-klantenfotos/facebook-overig/brandenburg-sabine-horseshoe-bend.JPG 'Page, Horseshoe Bend foto Sabine Brandenburg')

You can try it on the demo page: http://parsedown.org/extra/

It does indeed seem to be the environment. I did a few more tests. With the same configuration file on the same system, the script ran fine from the command line, but through the browser it crashes very hard. It takes a while (about 10 seconds) before the error is returned. The only difference between the command line and the browser is Apache, and I cannot imagine that this is the issue.

Also tried a few browsers, all the same. I'm going to dig a little deeper, can't stand something like this.

By the way, as far as I know parsedown.org also runs Apache, so it could depend on a specific Apache version or configuration.

^^ That could be. But I narrowed it down to the following regex on inlineLink:

'/^[(]((?:[^ ()]|[(][^ )]+[)])+)(?:[ ]+("[^"]*"|\'[^\']*\'))?[)]/'

I tried to simplify it, and that was possible, but not without breaking tests and compatibility. I can't solve the exact issue yet, but the first part of the regex (the URL) contains a lot of nesting, and that is where Apache crashes on such a long link. I think any system might run into this issue with longer links.
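To make the nesting concrete: in the URL part, every plain character is one repetition of the outer group, so a 300-character URL means roughly 300 repetitions. The sketch below is only an illustration, not a tested fix for Parsedown. It compares the pattern quoted above with a variant that consumes runs of plain characters possessively (`[^ ()]++`), which should accept the same links with far fewer outer repetitions. The helper `extractLink` and the sample inputs are hypothetical, and the variant has not been run against Parsedown's test suite.

```php
<?php
// Pattern quoted in this thread, copied verbatim.
$original = '/^[(]((?:[^ ()]|[(][^ )]+[)])+)(?:[ ]+("[^"]*"|\'[^\']*\'))?[)]/';

// Sketch (assumption, not Parsedown's shipped code): plain URL characters
// are consumed as one possessive run instead of one group repetition each.
$possessive = '/^[(]((?:[^ ()]++|[(][^ )]+[)])+)(?:[ ]+("[^"]*"|\'[^\']*\'))?[)]/';

// Hypothetical helper: returns the URL and title, or null when nothing matches.
function extractLink($pattern, $remainder)
{
    if (preg_match($pattern, $remainder, $m) !== 1) {
        return null;
    }

    return array(
        'url' => $m[1],
        // Strip the surrounding quotes from the optional title.
        'title' => isset($m[2]) ? substr($m[2], 1, -1) : null,
    );
}

// Made-up sample inputs: the text that follows "[alt]" or "[test]".
$samples = array(
    '(/i/am/an/(url))',
    '(/i/am/an/)url)',
    '(http://example.com "A title")',
    '(no closing parenthesis',
    '(' . str_repeat('a', 300) . " 'long url')",
);

foreach ($samples as $sample) {
    $same = extractLink($original, $sample) === extractLink($possessive, $sample);
    echo substr($sample, 0, 30), ' => ', $same ? 'same result' : 'DIFFERENT', PHP_EOL;
}
```

If both patterns agree on the real test suite as well, the possessive variant would be a candidate for reducing the nesting without changing which links are accepted.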

Can you give more info about your environment: OS, PHP version, Apache version, etc.?

@storeman I'm not sure, but there could be a very memory-intensive regex with certain input. What is your memory limit for PHP in Apache? Also how is the script crashing? Have you turned on error_reporting()?
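A small script along these lines could answer those questions in one go. Since the crash shows up only under Apache and not on the command line, it would need to be run under both and the output compared. This is a minimal sketch: the function name `describeEnvironment` is made up, and which settings matter here is an assumption. The PCRE limits are included because the suspect is a regex.

```php
<?php
// Report everything, so a failure is not silently swallowed.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: collect the settings that might differ between
// the Apache handler and the command line.
function describeEnvironment()
{
    return array(
        'php_version' => PHP_VERSION,
        'sapi' => PHP_SAPI, // e.g. "cli" or "apache2handler"
        'os' => PHP_OS,
        'memory_limit' => ini_get('memory_limit'),
        'pcre.backtrack_limit' => ini_get('pcre.backtrack_limit'),
        'pcre.recursion_limit' => ini_get('pcre.recursion_limit'),
    );
}

foreach (describeEnvironment() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

If the values are identical under both SAPIs, the difference is more likely to sit outside php.ini, for example in how the web server process itself is configured.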

My environment:
PHP 5.6.8, Apache 2.4.10 (virtual server), Apache2Handler, Windows 8.1.

@hkdobrev It just crashes and restarts Apache. No output, no data in the logs (only in apache it . On the Windows command line it runs well (with the same INI, thus the same memory limit). I also disabled all modules in PHP; this didn't change anything.

One of the main problems is that parentheses are allowed in the URL. The regex currently accepts a closing parenthesis inside the URL only if there is a matching opening one. This is inconsistent behavior, because it would allow:

[test](/i/am/an/(url))

But it would not allow

[test](/i/am/an/)url)
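The two cases above can be checked directly against the pattern quoted earlier in this thread. This is a small sketch that assumes the pattern is applied to the text right after the closing `]`, which is what the `^` anchor suggests. In the second case the match does not fail outright. It stops at the first closing parenthesis, so the URL is cut short and the rest is left over as plain text.

```php
<?php
// Pattern quoted earlier in this thread, copied verbatim.
$pattern = '/^[(]((?:[^ ()]|[(][^ )]+[)])+)(?:[ ]+("[^"]*"|\'[^\']*\'))?[)]/';

// The text following "[test]" in each example.
$balanced = '(/i/am/an/(url))';
$unbalanced = '(/i/am/an/)url)';

preg_match($pattern, $balanced, $a);
preg_match($pattern, $unbalanced, $b);

echo $a[1], PHP_EOL; // /i/am/an/(url)
echo $b[1], PHP_EOL; // /i/am/an/

// What is left over after the truncated link in the second case.
echo substr($unbalanced, strlen($b[0])), PHP_EOL; // url)
```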

I think it would be better to allow URLs to be quoted, but I imagine that would be a major breaking change.