guzzle / psr7

PSR-7 HTTP message library

URI should treat + and %20 the same

Firehed opened this issue

PHP version: 8.0.3 (irrelevant)

Description
Given functionally equivalent URIs that contain spaces (in the path or query string, at least), I would expect the normalization that occurs in Uri::parse() to produce identical internal and subsequent __toString() representations, regardless of which encoding format was used.

How to reproduce

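// Three spellings of the same path: raw space, "+", and "%20".
$equivalents = [
  'https://example.com/some page',
  'https://example.com/some+page',
  'https://example.com/some%20page',
];
// Round-trip each string through Uri and back to a string.
$parsed = array_map(fn ($str) => (string) (new \GuzzleHttp\Psr7\Uri($str)), $equivalents);
print_r($parsed);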

Expected output: all three values are identical (ideally in the form of the last entry in the starting array, with %20 encoded spaces).
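
To make the expected mapping concrete, here is a minimal sketch of the normalization being requested. The helper name `normalizeSpaces` is hypothetical and not part of guzzle/psr7. It also assumes that a literal `+` always means a space, which is the behavior asked for in this issue; RFC 3986 itself does not give `+` that meaning outside form-encoded query strings.

```php
<?php
// Hypothetical helper, for illustration only; not part of guzzle/psr7.
// Rewrites raw spaces and "+" to "%20" and leaves existing "%20" untouched.
function normalizeSpaces(string $component): string
{
    return str_replace([' ', '+'], '%20', $component);
}

$paths = ['/some page', '/some+page', '/some%20page'];

// Every spelling collapses to the same "%20" form.
foreach ($paths as $path) {
    echo normalizeSpaces($path), PHP_EOL; // prints "/some%20page" each time
}
```

Because the replacement is unconditional, a path that intentionally contains a literal plus sign would also be rewritten, which is the main trade-off of this approach.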

Possible Solution
Adding an additional urldecode() call at the start of Uri::parse() seems to help, but it causes other tests to fail.
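
A likely reason for those failures, sketched with PHP's built-in functions only (no Guzzle code involved): `urldecode()` converts `+` to a space, but it also decodes every other percent-escape in the string. That would conflict with the existing "Encoded unreserved chars are not decoded" test case shown in the diff below. The function name `compareDecoders` is an assumption made for this example.

```php
<?php
// Illustrative comparison of PHP's two decoders; the function name is hypothetical.
function compareDecoders(string $input): array
{
    return [
        'urldecode'    => urldecode($input),    // decodes "+" and all %XX escapes
        'rawurldecode' => rawurldecode($input), // decodes %XX escapes, keeps "+"
    ];
}

$result = compareDecoders('/p%61th/some+page');

echo $result['urldecode'], PHP_EOL;    // prints "/path/some page"
echo $result['rawurldecode'], PHP_EOL; // prints "/path/some+page"
```

In both cases `%61` becomes `a`, so a blanket decode at the start of parsing discards the distinction between `/p%61th` and `/path` that the existing tests rely on.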

Additional context
Reproduce case in a test:

diff --git a/tests/UriTest.php b/tests/UriTest.php
index 897d8be..5428e73 100644
--- a/tests/UriTest.php
+++ b/tests/UriTest.php
@@ -548,6 +548,8 @@ class UriTest extends TestCase
             ["/$unreserved?$unreserved#$unreserved", "/$unreserved", $unreserved, $unreserved, "/$unreserved?$unreserved#$unreserved"],
             // Encoded unreserved chars are not decoded
             ['/p%61th?q=v%61lue#fr%61gment', '/p%61th', 'q=v%61lue', 'fr%61gment', '/p%61th?q=v%61lue#fr%61gment'],
+            // Translate +-encoded spaces to %20
+            ['/pa+th?q=va+lue#frag+ment', '/pa%20th', 'q=va%20lue', 'frag%20ment', '/pa%20th?q=va%20lue#frag%20ment'],
         ];
     }

Duplicate of #366.