rize / UriTemplate

PHP URI Template (RFC 6570) supports both URI expansion & extraction

URI Extraction Issue

ConnectGrid opened this issue

I've just discovered URI-Template and it's a tremendous blessing for my applications! Thank you for this library.

While I'm still learning the behaviour of URI template formatting logic, I would expect the following to work:

URI:
/XYZ.shop/department/Electronics

Template:
/{app}.shop/department/{slug}

Expected Results:

  • app: XYZ
  • slug: Electronics

However, it's not liking the XYZ.shop in the first segment. Internally, it's assigning XYZ.shop to the {app}, but then ultimately failing when in Strict mode.

Should the above template work correctly, or am I misunderstanding the specification?

(the internal Regex is so complex, I don't dare mess with it.)

Hi @ConnectGrid, the spec only defines the expansion process, which allows values containing . and copies them exactly as-is.

$uri->expand("/{app}.shop", ["app" => "x.y.z"]);
# >> "/x.y.z.shop"

$uri->extract("/{app}.shop", "/x.y.z.shop");
# >> ["app" => "x.y.z.shop"]

As you can see, for the extraction to work correctly, it would need some look-ahead to decide whether the expression should consume the next token (beyond x.y.z) or not (it's greedy at the moment). We didn't implement look-ahead because it's not part of the spec, but maybe we should consider it 🤔.

For your example to work, it might require the "Label Expansion with Dot-Prefix" operator (i.e. {.shop}).

$uri->extract("/{app}{.shop}/department/{slug}", "/XYZ.shop/department/Electronics");
[
    [app] => XYZ
    [shop] => shop
    [slug] => Electronics
]

Hope it helps 👍

My assumption is that only the {braced} portions are considered, and the rest of the characters are treated as Literals. When I do a line-by-line debugging, this is what I see. The string .shop/department/ is stored correctly as a Literal, with the {app} and {slug} portions correctly stored as Expressions, but it ultimately fails to return the correct parsing when it goes to evaluate each Expression and Literal.

Why is the . preceding shop significant? Maybe you tried to explain it, but that's where my confusion remains. Even though it's outside of braces, does it participate in the Regex?

I didn't explain it very well in the previous answer. Let me try again to explain the issue.

$uri->expand("/{app}.shop", ["app" => "a.b.c.shop"]);

# /a.b.c.shop.shop
#
# 0 - "{app}" (expression)
# 1 - ".shop" (literal)

When {app} tries to match the URI, it matches everything (i.e. a.b.c.shop.shop) before the literal .shop is even checked, because . is a valid character in an expanded value and the match is greedy. This is the main problem you're referring to.

A possible solution is to look ahead, check whether the next node is a literal, and if so combine it into the current regex (or use the workaround of explicitly specifying {.shop}).

I might try to implement this when I have time.
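
The look-ahead idea can be sketched with plain preg_match, independent of the library. This is only an illustration under assumptions: $value is a simplified stand-in for the library's generated character class, and combining the expression with the literal that follows it is one possible approach, not the library's actual fix.

```php
<?php
// Simplified stand-in for the characters an expression may match
// (assumption: the library's real pattern is far more elaborate).
$value = "[A-Za-z0-9._~-]";

$uri = "/a.b.c.shop.shop";

// Expression matched on its own: being greedy, it also swallows the
// trailing ".shop", leaving nothing for the literal that follows.
preg_match("#^/(?<app>{$value}+)#", $uri, $greedy);

// Expression and following literal combined into one anchored pattern:
// backtracking hands the final ".shop" back to the literal.
preg_match("#^/(?<app>{$value}+)\\.shop$#", $uri, $combined);

echo $greedy["app"], "\n";   // a.b.c.shop.shop
echo $combined["app"], "\n"; // a.b.c.shop
```

With the combined pattern, the value a.b.c.shop used in the expansion above is recovered intact.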

Okay, I see where the issue is occurring.

URI value of XYZ.shop/department/Electronics and template of {app}.shop/department/{slug}...

For the {app} variable, the generated regex is:

#(?:[a-zA-Z0-9\-\._~!\$&'\(\)\*\+,;=%:@]+|%(?![A-Fa-f0-9]{2}))*(?:,(?:[a-zA-Z0-9\-\._~!\$&'\(\)\*\+,;=%:@]+|%(?![A-Fa-f0-9]{2}))+)*#

head explodes!

This seems to ignore the brace boundaries { and } and allows matching ., so it's grabbing XYZ.shop as the value of {app}.

I'm guessing your explanation touched on this with regards to look-ahead. (I'm not a Regex master). It seems to be matching everything up until the next forward slash /.
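
A quick check of that observation, using only the character class from the pattern quoted above (an assumption: a single alternative of the full regex is enough to show which characters it accepts). The class contains . but not /, so a match starting at XYZ runs through .shop and stops at the first slash.

```php
<?php
// Character class taken from the generated regex quoted above.
$class = '[a-zA-Z0-9\-\._~!\$&\'\(\)\*\+,;=%:@]';

// Match as many allowed characters as possible from the start of the URI.
preg_match('#^' . $class . '+#', 'XYZ.shop/department/Electronics', $m);
echo $m[0], "\n"; // XYZ.shop

// "." is accepted by the class, "/" is not.
$dotAllowed   = preg_match('#^' . $class . '$#', '.');
$slashAllowed = preg_match('#^' . $class . '$#', '/');
var_dump($dotAllowed === 1, $slashAllowed === 0);
```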

Was your suggestion of {.shop} variable a workaround so that it actively participates in the matching, rather than being a Literal? Am I on the right track?

I've employed your suggestion of {.shop} and it's actually going to be a useful variable, so thanks for that.

Closing this issue for now.

Exactly! I'll try to work on this and let you know when it's fixed. Thanks for using the lib and reporting your use case (which is interesting) 👍