URI should treat + and %20 the same
Firehed opened this issue
PHP version: 8.0.3 (irrelevant)
Description
Given functionally equivalent URIs that contain spaces (in the path or query string, at least), I would expect the normalization that occurs in Uri::parse() to make the internal and subsequent __toString() representations identical, regardless of which encoding format was used.
How to reproduce
$equivalents = [
'https://example.com/some page',
'https://example.com/some+page',
'https://example.com/some%20page',
];
$parsed = array_map(fn ($str) => (string)(new \GuzzleHttp\Psr7\Uri($str)), $equivalents);
print_r($parsed);
Expected output: all three values are identical (ideally in the form of the last entry in the starting array, with %20-encoded spaces).
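For context, PHP's built-in functions already distinguish between the two space conventions. The following is a minimal sketch using only core PHP functions (nothing here calls into guzzlehttp/psr7, so it says nothing about what the library currently does). It shows that treating + as a space is the form-encoding convention, while the RFC 3986 style functions keep + as a literal plus sign.

```php
<?php
// Core PHP only; nothing here calls into guzzlehttp/psr7.

// urldecode() follows the form-encoding convention:
// '+' and '%20' both decode to a space.
var_dump(urldecode('some+page'));      // string(9) "some page"
var_dump(urldecode('some%20page'));    // string(9) "some page"

// rawurldecode() follows RFC 3986: '+' stays a literal plus sign.
var_dump(rawurldecode('some+page'));   // string(9) "some+page"
var_dump(rawurldecode('some%20page')); // string(9) "some page"

// rawurlencode() emits %20 for a space and %2B for a literal plus.
var_dump(rawurlencode('some page'));   // string(11) "some%20page"
var_dump(rawurlencode('some+page'));   // string(11) "some%2Bpage"
```

So whether 'some+page' and 'some%20page' count as equivalent depends on which of these two conventions the parser is assumed to follow.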
Possible Solution
Adding an additional urldecode call at the start of Uri::parse() seems to help, but causes other tests to fail.
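One likely reason for those failures is that urldecode decodes every percent-escape, not just spaces, which conflicts with the existing "Encoded unreserved chars are not decoded" case visible in the diff below. A narrower idea is sketched here, with the caveat that plusToPercent20 is a hypothetical helper invented for illustration and is not part of guzzlehttp/psr7. It also assumes that + is always meant as a space, which changes the meaning of any URI where + is a literal plus sign.

```php
<?php
/**
 * Hypothetical helper, not part of guzzlehttp/psr7: rewrite '+' to '%20'
 * in a single URI component without decoding anything else.
 */
function plusToPercent20(string $component): string
{
    // Only literal '+' characters are touched; existing percent-escapes
    // such as %61 or %2B are left exactly as they were.
    return str_replace('+', '%20', $component);
}

// A blanket urldecode() also decodes escapes that should be preserved.
var_dump(urldecode('/p%61th'));       // string(5) "/path"
var_dump(urldecode('/a%2Fb'));        // string(4) "/a/b"

// The narrower rewrite leaves other escapes alone.
var_dump(plusToPercent20('/pa+th'));  // string(8) "/pa%20th"
var_dump(plusToPercent20('/p%61th')); // string(7) "/p%61th"
var_dump(plusToPercent20('q=1%2B1')); // string(7) "q=1%2B1"
```

Applying such a rewrite per component (path, query, fragment) rather than to the whole URI string would match the shape of the test case proposed below, but where it would belong inside Uri is left open here.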
Additional context
Reproduce case in a test:
diff --git a/tests/UriTest.php b/tests/UriTest.php
index 897d8be..5428e73 100644
--- a/tests/UriTest.php
+++ b/tests/UriTest.php
@@ -548,6 +548,8 @@ class UriTest extends TestCase
["/$unreserved?$unreserved#$unreserved", "/$unreserved", $unreserved, $unreserved, "/$unreserved?$unreserved#$unreserved"],
// Encoded unreserved chars are not decoded
['/p%61th?q=v%61lue#fr%61gment', '/p%61th', 'q=v%61lue', 'fr%61gment', '/p%61th?q=v%61lue#fr%61gment'],
+ // Translate +-encoded spaces to %20
+ ['/pa+th?q=va+lue#frag+ment', '/pa%20th', 'q=va%20lue', 'frag%20ment', '/pa%20th?q=va%20lue#frag%20ment'],
];
}
Duplicate of #366.