MB_CASE_FOLD doesn't handle ẞ character properly
colinodell opened this issue
Summary
It seems that this polyfill incorrectly case-folds the ẞ character to ß instead of ss.
Details
Running this code on PHP 7.3 with mbstring installed:
echo mb_convert_case('ẞ', MB_CASE_FOLD);
Will produce this output: ss
However, when I run that same input through this polyfill (master branch):
echo \Symfony\Polyfill\Mbstring\Mbstring::mb_convert_case('ẞ', \Symfony\Polyfill\Mbstring\Mbstring::MB_CASE_FOLD);
I instead get ß. This does not match mbstring's behavior on PHP 7.3.
It looks like this occurs with both U+1E9E LATIN CAPITAL LETTER SHARP S and U+00DF LATIN SMALL LETTER SHARP S. Both of these should return ss when case-folded.
PHP 7.3 uses full case folding.
This is implemented in tchwork/utf8, so we could borrow the map from there:
https://github.com/tchwork/utf8/blob/30ec6451aec7d2536f0af8fe535f70c764f2c47a/src/Patchwork/Utf8.php#L168-L170
Up for a PR?
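For illustration, here is a rough sketch of what the full case folding step could look like. It is not the polyfill's actual code and not a copy of the tchwork/utf8 map: the constant FULL_CASE_FOLD_SKETCH and the function expand_full_folds() are hypothetical names, and the map holds only three of the entries that Unicode's CaseFolding.txt marks with status F. The idea is that these multi-character expansions would be applied in addition to the ordinary one-to-one lowercase mapping.

```php
<?php

// Hypothetical excerpt of full case folding entries (status "F" in
// Unicode's CaseFolding.txt). A real map would need every "F" entry.
const FULL_CASE_FOLD_SKETCH = [
    "\u{00DF}" => 'ss', // LATIN SMALL LETTER SHARP S
    "\u{1E9E}" => 'ss', // LATIN CAPITAL LETTER SHARP S
    "\u{FB01}" => 'fi', // LATIN SMALL LIGATURE FI
];

// Hypothetical helper: expands characters whose full case folding is
// longer than one character. It does not lowercase anything else, so it
// is meant to run alongside the existing lowercase step.
function expand_full_folds(string $s): string
{
    // strtr() with an array replaces each key with its value, and UTF-8
    // byte sequences cannot partially match inside other characters.
    return strtr($s, FULL_CASE_FOLD_SKETCH);
}

echo expand_full_folds("\u{1E9E}"), "\n";       // ss
echo expand_full_folds("stra\u{00DF}e"), "\n";  // strasse
```

With a map like this in place, both sharp S code points would fold to ss, which is what mbstring returns on PHP 7.3.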