erusev / parsedown

Better Markdown Parser in PHP

Home Page: https://parsedown.org

HTML entity decoding even when raw HTML not permitted

dblume opened this issue

I have a Parsedown client that sets safe mode, but I'd still like HTML entities decoded (e.g., &theta; should render as θ). I'm trying selective ampersand unescaping after escaping. Have I just made my Parsedown unsafe, and do I need to revert?
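
For readers following along, "selective ampersand unescaping after escaping" could look roughly like the sketch below. It is only an illustration of the idea, not the code from this issue and not part of Parsedown: `restoreEntities` is a hypothetical helper, and the escaping step is assumed to be a plain `htmlspecialchars` call.

```php
<?php
// Hypothetical helper, not part of Parsedown: after the text has been
// escaped, turn "&amp;name;" back into "&name;" so that entity references
// are rendered by the browser again.
function restoreEntities(string $escaped): string
{
    // Only well-formed references are restored: named (&amp;theta;),
    // decimal (&amp;#952;) and hexadecimal (&amp;#x3B8;). A bare "&amp;"
    // stays escaped, and so do &lt;, &gt; and the quote entities.
    $pattern = '/&amp;(#[0-9]{1,7}|#[xX][0-9a-fA-F]{1,6}|[a-zA-Z][a-zA-Z0-9]{1,31});/';

    // Fall back to the untouched input if the regex engine reports an error.
    return preg_replace($pattern, '&$1;', $escaped) ?? $escaped;
}

// Assumed escaping step, standing in for whatever safe mode does internally.
$input   = 'angle &theta; <script>alert(1)</script> & more';
$escaped = htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo restoreEntities($escaped), "\n";
// angle &theta; &lt;script&gt;alert(1)&lt;/script&gt; &amp; more
```

The angle brackets stay escaped because the pattern only matches an escaped ampersand that begins a complete entity reference. Typing `&lt;script&gt;` as input just produces the literal text `<script>` on the page, not a tag.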

IMO this should be fine. The main trouble that I'm aware of with HTML entities arises when sanitising the content of an HTML element or attribute where things have special meaning. E.g. preventing someone from putting javascript: in a link destination is a little tricky if you allow them to insert HTML entities, since they may encode any of the letters as well as the colon (there are also additional tricks that can be pulled here because browsers want to "fix" broken-looking things).
In this context you're putting things into the text content of an HTML element, and there isn't any special sanitisation going on there anyway, so I don't think this should allow anything tricky :)
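
To make the link-destination concern concrete, here is a small sketch of why a scheme check has to run on the decoded value. The function `looksLikeJavascriptUrl` is hypothetical and is not how Parsedown filters links. It only shows that a check on the raw text misses an entity-encoded `javascript:`.

```php
<?php
// Hypothetical check, not Parsedown's own link filtering.
function looksLikeJavascriptUrl(string $href): bool
{
    // Decode entities first, as a browser does for attribute values.
    $decoded = html_entity_decode($href, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Drop spaces and control characters everywhere. This is a deliberate
    // over-approximation of the whitespace browsers tolerate in a URL.
    $normalized = preg_replace('/[\x00-\x20]+/', '', $decoded) ?? $decoded;

    return stripos($normalized, 'javascript:') === 0;
}

$href = '&#106;avascript&#58;alert(1)';

var_dump(stripos($href, 'javascript:') === 0); // check on the raw text: bool(false)
var_dump(looksLikeJavascriptUrl($href));       // check after decoding: bool(true)
```

In the text content of an element, the same encoded string is only displayed as the characters `javascript:alert(1)`. Nothing interprets it as a URL, which is the point made above.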