erusev / parsedown-extra

Markdown Extra Extension for Parsedown

support more attributes than just ID or CLASS?

frumbert opened this issue

The parseAttributeData routine is a protected function that only recognizes two kinds of attributes: an ID, which starts with a #, and a class, which starts with a dot (.). Other Markdown Extra converters (e.g. pandoc) allow other attributes, so you could have

[my-caption](http://hyperlink.url){.class-1 .class-2 lang=fr title="this is a title tag" rel=nofollow}

which would turn into

<a href="http://hyperlink.url" class="class-1 class-2" lang="fr" title="this is a title tag" rel="nofollow">my-caption</a>

How can I extend PDE with my own version of the parseAttributeData function so that I can read and render other attributes of the block? Since the routine already splits on spaces to find attributes, I might have to pre-escape values that contain spaces, e.g. title=this+is+a+title+tag or title=some%20title, which would be acceptable. Something like this:

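function parseAttributeData($attributeString)
{
    $Data = array();
    $classes = array();

    // Attributes are split on spaces, so values containing spaces must be
    // pre-escaped (e.g. title=this+is+a+title+tag or title=some%20title).
    $attributes = preg_split('/[ ]+/', $attributeString, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($attributes as $attribute)
    {
        if ($attribute[0] === '#') { // id
            $Data['id'] = substr($attribute, 1);
        } elseif ($attribute[0] === '.') { // class
            $classes[] = substr($attribute, 1);
        } else { // any other attribute: name=value, or a bare name
            $attrib = explode('=', $attribute, 2); // split on the first "=" only
            $value = isset($attrib[1]) ? $attrib[1] : '';

            // strip one pair of matching surrounding quotes, if present
            if (strlen($value) >= 2 && ($value[0] === '"' || $value[0] === "'") && substr($value, -1) === $value[0]) {
                $value = substr($value, 1, -1);
            }
            // no re-quoting here: Parsedown adds the quotes when it renders the attribute
            $Data[$attrib[0]] = urldecode($value);
        }
    }

    if ($classes) {
        $Data['class'] = implode(' ', $classes);
    }
    return $Data;
}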
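
The pre-escaping idea can be checked in isolation. This is only a small sketch using standard PHP functions; it shows that both escaping styles mentioned above decode to a value with real spaces, and why a quoted value with raw spaces does not survive the space split:

```php
<?php
// Both escaping styles decode to a value containing real spaces.
$plusDecoded    = urldecode('this+is+a+title+tag');
$percentDecoded = urldecode('some%20title');

// A quoted value with raw spaces is broken into pieces by the same
// space split that parseAttributeData uses.
$pieces = preg_split('/[ ]+/', 'title="this is a title tag"', -1, PREG_SPLIT_NO_EMPTY);

var_dump($plusDecoded, $percentDecoded);
print_r($pieces);
```

One side effect to keep in mind: urldecode also turns a literal + in a value into a space, so a value that needs a real plus sign would have to be written as %2B.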

This is all very well, but the basic attribute matcher still won't match these attributes, since its regexp is (?:[#.][-\w]+[ ]*), i.e. a dot or hash followed by one or more word characters or hyphens, then optional spaces.
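
To see the mismatch concretely, here is a minimal sketch that wraps the stock pattern quoted above in a simple brace matcher. The wrapping expression and the variable names are assumptions made for this illustration; they only approximate how ParsedownExtra applies the pattern internally.

```php
<?php
// Stock attribute pattern as quoted in this issue.
$stock = '(?:[#.][-\w]+[ ]*)';

// Assumption: a simplified wrapper that requires the whole {...} block
// to consist of repetitions of the attribute pattern.
$wrapped = '/^\{(' . $stock . '+)\}/';

$classesOnly = preg_match($wrapped, '{.class-1 .class-2}'); // only classes
$withLang    = preg_match($wrapped, '{.class-1 lang=fr}');  // adds a key=value pair

var_dump($classesOnly, $withLang);
```

The second block is rejected as a whole, so parseAttributeData is never reached for it, whatever that function is able to parse.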

'/(?:
  [#.][-\w:\\\]+[ ]*   # matches ID and class
|                      # or ...
  [-\w:\\\]+(?:=(?:    # matches attribute and value pairs ...
    ["\'][^\n]*?["\']  # ... with quotes
|                      # or ...
    [^\s]+             # ... without quotes
  )?)?[ ]*
)/x'
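
One way this pattern might be wired in is by overriding the attribute pattern in a subclass. The sketch below assumes ParsedownExtra keeps its pattern in a protected $regexAttribute property (as in the releases I have looked at; check your version), and it uses a compact form of the pattern above without delimiters or the x flag. The class name AttributeExtra and the helper matchAttributeBlock are hypothetical and exist only for this demonstration; the stand-in base class is there so the snippet runs without the library.

```php
<?php
// Stand-in so the sketch runs on its own; skipped when the real
// ParsedownExtra class is already loaded.
if (!class_exists('ParsedownExtra')) {
    class ParsedownExtra
    {
        protected $regexAttribute = '(?:[#.][-\w]+[ ]*)';
    }
}

class AttributeExtra extends ParsedownExtra // hypothetical name
{
    // ID/class, or key with an optional quoted or unquoted value.
    protected $regexAttribute =
        '(?:[#.][-\w:\\\\]+[ ]*|[-\w:\\\\]+(?:=(?:["\'][^\n]*?["\']|[^\s]+)?)?[ ]*)';

    // Hypothetical demo helper: returns the inside of a {...} block
    // if the whole block matches the pattern, otherwise null.
    public function matchAttributeBlock($text)
    {
        $pattern = '/^\{(' . $this->regexAttribute . '+)\}/';
        return preg_match($pattern, $text, $m) ? $m[1] : null;
    }
}

$parser = new AttributeExtra();
$inner = $parser->matchAttributeBlock(
    '{.class-1 .class-2 lang=fr title="this is a title tag" rel=nofollow}'
);
var_dump($inner);
```

In a real subclass the parseAttributeData override would sit next to the property. Note that the pattern accepts quoted values with spaces, so the parsing function would need to handle those rather than splitting blindly on spaces.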

https://github.com/taufik-nurrohman/parsedown-extra-plugin

@tovic, interesting plugin. Will look into it. I wasn't able to get that regex to work on regexr.com. Can you explain [#.][-\w:\\\]+[ ]* in a bit more detail?

It matches a # or a . followed by one or more characters that are each a word character (\w), a -, a : or a \, and then zero or more spaces.
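
A likely reason the fragment misbehaves in an online tester is that it is written as a PHP single-quoted string literal. PHP collapses \\ to a single backslash before the regex engine sees the pattern, so \\\] in the source becomes \\] in the actual pattern: an escaped backslash followed by the closing bracket. Pasted verbatim into a tester, \\\] instead reads as an escaped backslash plus an escaped ], and the character class does not close where intended. A small sketch, with variable names chosen for illustration:

```php
<?php
// The fragment exactly as written in the PHP source.
$fragment = '[#.][-\w:\\\]+[ ]*';

// What the regex engine actually receives after PHP string unescaping.
echo $fragment, "\n";

$pattern = '/^(?:' . $fragment . ')$/';

$idMatch    = preg_match($pattern, '#main');      // ID
$classMatch = preg_match($pattern, '.ns:item  '); // class with ":" and trailing spaces
$slashMatch = preg_match($pattern, '.a\b');       // class containing a backslash
$pairMatch  = preg_match($pattern, 'lang=fr');    // no leading # or .

var_dump($idMatch, $classMatch, $slashMatch, $pairMatch);
```

To try the fragment in a tester, use the unescaped form that the echo line prints rather than the text from the PHP source.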

commented

This bothers me too. I found that https://github.com/michelf/php-markdown can do this.