mb_str_split() bug when string contains newline
njandreasson opened this issue
I stumbled upon a bug where mb_str_split() returns an unexpected result. After some research, I found that it happens when the input string contains one or more newlines.
Example code to reproduce:
$str = "Hello!\nThis is an example with a string with new line.\nI'm just making an example to point out how mb_str_split() was broken in 1.13.0 when string has new lines. Thanks for reading this long example string! Have a nice day!";
var_dump(mb_strlen($str));
var_dump(mb_str_split($str, 153));
Expected output (https://3v4l.org/L6M1K):
int(223)
array(2) {
[0]=>
string(153) "Hello!
This is an example with a string with new line.
I'm just making an example to point out how mb_str_split() was broken in 1.13.0 when string has ne"
[1]=>
string(70) "w lines. Thanks for reading this long example string! Have a nice day!"
}
The problem started in 1.13.0, the release in which #199 was merged.
If I understand the intention of the pull request correctly, the author wanted to return early when the split length is 1. The initial example was:
if (1 === $split_length && 'UTF-8' === $encoding) {
    return preg_split('//u', $string, null, PREG_SPLIT_NO_EMPTY);
}
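For reference, here is a minimal sketch of what that early return does for a split length of 1. The helper name split_into_chars() is hypothetical and only used for illustration, and -1 is passed as the limit instead of null to avoid a deprecation notice on newer PHP versions. Since the empty pattern does not rely on `.`, newlines should come back as ordinary single-character chunks:

```php
<?php
// Hypothetical helper for illustration; not part of the polyfill.
function split_into_chars(string $string): array
{
    // An empty pattern with the u modifier matches at every UTF-8 character
    // boundary, so each character (including "\n") becomes its own chunk.
    // PREG_SPLIT_NO_EMPTY drops the empty pieces at the start and end.
    return preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
}

// Three characters: "a", a newline, and the two-byte "é".
var_dump(split_into_chars("a\né"));
```

This shortcut on its own does not appear to be affected by newlines, which suggests the breakage comes from how longer split lengths are handled.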
Actual output with polyfill 1.13.0+:
int(223)
array(3) {
[0]=>
string(55) "Hello!
This is an example with a string with new line.
"
[1]=>
string(153) "I'm just making an example to point out how mb_str_split() was broken in 1.13.0 when string has new lines. Thanks for reading this long example string! H"
[2]=>
string(15) "ave a nice day!"
}
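The shape of this output (a short first chunk ending at a newline, then a full-length chunk, then the remainder) is what one would expect from a regex split in which `.` does not match newlines. The following is only a sketch under that assumption: the pattern `/(.{N})/u` and the helper name split_with_dot() are made up for illustration and are not copied from the polyfill.

```php
<?php
// Assumed pattern shape, for illustration only; not taken from the polyfill.
function split_with_dot(string $string, int $length, string $modifiers): array
{
    $pattern = '/(.{' . $length . '})/' . $modifiers;

    // DELIM_CAPTURE returns the captured chunks themselves,
    // NO_EMPTY drops the empty pieces between adjacent matches.
    return preg_split($pattern, $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$str = "ab\ncdefg";

// Without the s modifier "." does not match "\n", so the first match starts
// after the newline and the text before it is returned as a shorter chunk.
var_dump(split_with_dot($str, 4, 'u'));  // "ab\n", "cdef", "g"

// With the s (dotall) modifier "." also matches "\n", so the string is cut
// every 4 characters.
var_dump(split_with_dot($str, 4, 'us')); // "ab\nc", "defg"
```

If the polyfill uses a similar pattern, either adding the s modifier or falling back to an mb_substr() loop would be possible ways to restore the expected chunks.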
The problem seems to be that the final commit does not check whether $split_length is actually 1, so preg_split() is always used: