sabberworm / PHP-CSS-Parser

A Parser for CSS Files written in PHP. Allows extraction of CSS files into a data structure, manipulation of said structure and output as (optimized) CSS.

Home Page: http://www.sabberworm.com/blog/2010/6/10/php-css-parser

Wrong charset breaks parsing

raxbg opened this issue

I know there are already some other issues related to charsets, but I ran into a case where the parser simply could not get past a comment block containing multibyte characters. I also did not know the correct charset of the file, but that is a separate problem.

Do you think that something like raxbg@33b4306 will work in all cases? It worked in my case, and it performs much better than using mb_substr(). In fact, performance does not seem to be affected by this change at all.

I am closing this. The approach mentioned above has issues. It is much better to use mb_convert_encoding() to convert from whatever the source encoding is to UTF-8 and then run the parser.
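
For anyone landing here later, this is roughly what "convert first, then parse" can look like. It is only a minimal sketch: cssToUtf8() is a hypothetical helper, not part of PHP-CSS-Parser, and the ISO-8859-1 fallback is an assumption that you should replace with the real source encoding if you know it.

```php
<?php
// Hypothetical helper (not part of PHP-CSS-Parser): make sure the CSS source
// is valid UTF-8 before it is handed to the parser.
// Assumption: input that is not valid UTF-8 is in $fallbackEncoding.
function cssToUtf8(string $css, string $fallbackEncoding = 'ISO-8859-1'): string
{
    // Already valid UTF-8 (plain ASCII included): leave it untouched.
    if (mb_check_encoding($css, 'UTF-8')) {
        return $css;
    }

    // Otherwise re-encode from the assumed source encoding to UTF-8.
    return mb_convert_encoding($css, 'UTF-8', $fallbackEncoding);
}

// A comment containing a Latin-1 encoded "é" (byte 0xE9), which is not valid UTF-8.
$source = "/* caf\xE9 */ a { color: red; }";
$utf8Css = cssToUtf8($source);

// With the library installed, the converted string can then be parsed as usual:
// $document = (new Sabberworm\CSS\Parser($utf8Css))->parse();
```

Checking with mb_check_encoding() rather than guessing with mb_detect_encoding() keeps the behaviour predictable: valid UTF-8 passes through unchanged, and everything else is treated as the fallback encoding. The obvious limitation is that a wrong fallback produces wrong characters, so it does not replace knowing the actual charset.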

Ok, thanks. I hope to revive #116, which does exactly that IIRC…

To be honest, I would be afraid to merge this PR into my production environment now that I have seemingly working charset detection, mostly because everything seems to work pretty well for UTF-8 encoded strings. Converting the source to UTF-8 beforehand seems to be enough. If @skodak is willing to check the changes against the latest version, that would be okay with me, but simply merging/rebasing the proposed changes seems scary at this point 😛