Masterminds / html5-php

An HTML5 parser and serializer for PHP.

Home Page: http://masterminds.github.io/html5-php/

ext-mbstring should be required

alecpl opened this issue:

Here's why:

  1. Masterminds\HTML5\Parser\CharacterReference::lookupDecimal() uses mb_decode_numericentity() unconditionally.
  2. Masterminds\HTML5\Parser\UTF8Utils::convertToUTF8() already needs either iconv or mbstring to be available (if the input encoding is not 'auto').
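
To illustrate the first point, here is a minimal sketch of decimal entity decoding that depends on mbstring. The function name `decodeDecimalEntitySketch()` and the conversion map are assumptions made for this example; this is not the library's actual code.

```php
<?php
// Hypothetical helper, not part of html5-php: decode a decimal code point
// with mbstring and fail loudly when the extension is missing.
function decodeDecimalEntitySketch(int $codePoint): string
{
    if (!function_exists('mb_decode_numericentity')) {
        throw new RuntimeException('ext-mbstring is required');
    }
    // Conversion map: [first code point, last code point, offset, mask].
    // These values are an assumption that covers the whole Unicode range.
    $map = [0x0, 0x10FFFF, 0, 0x1FFFFF];

    return mb_decode_numericentity('&#' . $codePoint . ';', $map, 'UTF-8');
}

// Usage: 233 is U+00E9, which is the two bytes C3 A9 in UTF-8.
if (extension_loaded('mbstring')) {
    echo bin2hex(decodeDecimalEntitySketch(233)), "\n"; // prints "c3a9"
}
```

Without the extension, an unguarded call to mb_decode_numericentity() ends in a fatal "undefined function" error, which is why declaring the dependency up front seems safer.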

Requiring mbstring would allow us to:

  1. Get rid of iconv() use. In my experience mbstring is really a better solution.
  2. Remove use of utf8_decode() which is not really valid and not needed when mbstring is available.
  3. Get rid of the fallback code.
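
As a rough idea of what the conversion could look like with mbstring only, here is a minimal sketch. The name `convertToUtf8Sketch()` is hypothetical, and dropping invalid sequences via mb_substitute_character('none') is an assumption about the desired behaviour, not a statement about what the library does today.

```php
<?php
// Hypothetical mbstring-only conversion, with no iconv() or utf8_decode()
// fallback. Assumes ext-mbstring is loaded.
function convertToUtf8Sketch(string $data, string $encoding = 'UTF-8'): string
{
    // Remember the current substitute character so it can be restored.
    $previous = mb_substitute_character();
    // Assumption: characters that cannot be converted are dropped.
    mb_substitute_character('none');
    try {
        return mb_convert_encoding($data, 'UTF-8', $encoding);
    } finally {
        mb_substitute_character($previous);
    }
}

// Usage: "caf" followed by byte E9 (e with acute accent in ISO-8859-1).
if (extension_loaded('mbstring')) {
    echo bin2hex(convertToUtf8Sketch("caf\xE9", 'ISO-8859-1')), "\n"; // prints "636166c3a9"
}
```

With a single code path there is nothing left to fall back to, so the iconv and utf8_decode() branches could go away.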

Actually, utf8_decode() is deprecated as of PHP 8.2 and will be removed later, so this is more of a bug now.
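
For reference, a minimal sketch of how a utf8_decode() call can be expressed with mbstring instead. The wrapper name `utf8ToLatin1Sketch()` is made up for this example; whether the library needs this conversion at all once mbstring is required is a separate question.

```php
<?php
// Hypothetical replacement for utf8_decode(): convert UTF-8 to ISO-8859-1
// using mbstring. Assumes ext-mbstring is loaded.
function utf8ToLatin1Sketch(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// Usage: the UTF-8 bytes C3 A9 become the single Latin-1 byte E9.
if (extension_loaded('mbstring')) {
    echo bin2hex(utf8ToLatin1Sketch("caf\xC3\xA9")), "\n"; // prints "636166e9"
}
```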

Sorry for the late reply. What you are suggesting makes sense. I would be happy to see a PR.