http4s / http4s

A minimal, idiomatic Scala interface for HTTP

Home Page: https://http4s.org/

Link header: Extension Relations need not be quoted

bblfish opened this issue · comments

My Akka server produces Link relations as in this response

HTTP/1.1 401 Unauthorized
WWW-Authenticate: HttpSig realm="http://localhost:8080/ldes/defaultCF/stream"
Link: <http://localhost:8080/ldes/defaultCF/stream.acl>; rel=acl, <http://localhost:8080/ldes/defaultCF/.acl>; rel=https://www.w3.org/ns/auth/acl#accessControl
Server: reactive-solid/0.3 akka-http/10.2.10
Date: Thu, 04 May 2023 10:48:08 GMT
Content-Length: 0

which http4s does not parse properly, because the URL used as the second rel value is not quoted.

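    import org.http4s.ParseResult
    import org.http4s.headers.Link

    // Unquoted extension relation type: http4s rejects the whole header.
    val h2 = """<http://localhost:8080/ldes/defaultCF/stream.acl>; rel=acl,""" +
      """ <http://localhost:8080/ldes/defaultCF/.acl>; rel=https://www.w3.org/ns/auth/acl#accessControl"""
    val pr2: ParseResult[Link] = Link.parse(h2)
    assert(pr2.isLeft, pr2)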

The same header with the extension relation quoted parses correctly:

    val h1 = """<http://localhost:8080/ldes/defaultCF/stream.acl>; rel=acl,""" +
      """ <http://localhost:8080/ldes/defaultCF/.acl>; rel="https://www.w3.org/ns/auth/acl#accessControl""""
    val pr1: ParseResult[Link] = Link.parse(h1)
    assert(pr1.isRight, pr1)

At first glance, RFC 8288 seems to say that leaving the URL unquoted is acceptable:

Note that extension relation types are REQUIRED to be absolute URIs
in Link header fields and MUST be quoted when they contain characters
not allowed in tokens, such as a semicolon (";") or comma (",") (as
these characters are used as delimiters in the header field itself).

Akka checks that those characters are absent before omitting the quotes (see LinkValue), which seems a waste of CPU just to save two characters.

RFC 8288 defines the value of the Link header via the following simple ABNF:

Link       = #link-value
link-value = "<" URI-Reference ">" *( OWS ";" OWS link-param )
link-param = token BWS [ "=" BWS ( token / quoted-string ) ]

The problem we are having is with token, and in particular with the ":" character.
token is defined in RFC 7230 with the ABNF

token         = 1*tchar
tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                    / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                    / DIGIT / ALPHA
                    ; any VCHAR, except delimiters

which, as far as I can see, does not contain the ":" character.
That seems to confirm that the http4s implementation is correct.
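
As a rough check of that reading, the tchar set above can be transcribed directly into Scala. This is only a sketch: TokenCheck, isTchar and isToken are hypothetical names chosen for illustration and are not part of http4s.

```scala
// Hypothetical helper transcribing the RFC 7230 tchar/token ABNF quoted above.
object TokenCheck {
  // The punctuation characters listed in the tchar rule.
  private val tcharSymbols: Set[Char] = "!#$%&'*+-.^_`|~".toSet

  // tchar = one of the symbols above / DIGIT / ALPHA (ASCII only).
  def isTchar(c: Char): Boolean =
    (c >= '0' && c <= '9') ||
      (c >= 'a' && c <= 'z') ||
      (c >= 'A' && c <= 'Z') ||
      tcharSymbols(c)

  // token = 1*tchar
  def isToken(s: String): Boolean = s.nonEmpty && s.forall(isTchar)
}
```

With this transcription, "acl" is a token, while the extension relation from the failing header is not, because ":" and "/" are both outside the tchar set.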

The problem is that the predecessor of RFC 8288, namely RFC 5988, defines link-value differently:

link-value       = "<" URI-Reference ">" *( ";" link-param )
link-param     = ( ( "rel" "=" relation-types )
                 | ( "anchor" "=" <"> URI-Reference <"> )
                 | ( "rev" "=" relation-types )
                 | ( "hreflang" "=" Language-Tag )
                 | ( "media" "=" ( MediaDesc | ( <"> MediaDesc <"> ) ) )
                 | ( "title" "=" quoted-string )
                 | ( "title*" "=" ext-value )
                 | ( "type" "=" ( media-type | quoted-mt ) )
                 | ( link-extension ) )
   ;[snip]
relation-type  = reg-rel-type | ext-rel-type
reg-rel-type   = LOALPHA *( LOALPHA | DIGIT | "." | "-" )
ext-rel-type   = URI

We are concerned here with ext-rel-type.

Here the ABNF seems insufficiently specified, as URI is defined in
RFC 3986, Appendix A, as

URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

without quotes, which would indicate that no quotes can be used. But the text
in the older Link RFC, RFC 5988 §5.3,
states that:

Note that extension relation types are REQUIRED to be absolute URIs
in Link headers, and MUST be quoted if they contain a semicolon (";")
or comma (",") (as these characters are used as delimiters in the
header itself).

So under RFC 5988 the only stated reason for quoting an extension relation type
is the presence of those special characters.
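
To make that reading concrete, here is a minimal sketch of a producer that follows only the RFC 5988 prose. Rfc5988Quoting, mustBeQuoted and render are hypothetical names, and this is not a claim about how Akka implements LinkValue.

```scala
// Hypothetical producer-side rule based only on the RFC 5988 §5.3 note:
// quote an extension relation type only if it contains ';' or ','.
object Rfc5988Quoting {
  def mustBeQuoted(rel: String): Boolean =
    rel.exists(c => c == ';' || c == ',')

  // Renders the value of a rel parameter, adding quotes only when required.
  def render(rel: String): String =
    if (mustBeQuoted(rel)) "\"" + rel + "\"" else rel
}
```

Under this rule the relation https://www.w3.org/ns/auth/acl#accessControl is emitted without quotes, which is exactly the form the http4s parser rejects.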

Conclusion

So it seems that the http4s implementation is correct for the more recent spec: because
absolute URLs always contain a ":", which is not a token character, they cannot be used
without quotes. Still, this leaves a backward-compatibility problem.

Akka is fixing its production of Link headers in PR 4267 to make them easier to parse with RFC 8288 compliant parsers. It may still be worth making this parser more lenient in what it accepts, as per Postel's Law.
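
One possible shape for such leniency, sketched here as a pre-processing step rather than a change to the parser itself, is to quote any unquoted rel value that is not a valid token before handing the header to the strict parser. LenientLink and quoteUnquotedRel are hypothetical names, not http4s API. The regex is an assumption that covers only the rel parameter and relies on unquoted values never containing ";" or ",", as RFC 5988 requires.

```scala
import scala.util.matching.Regex

// Hypothetical helper, not part of http4s: quotes unquoted rel values that are
// not valid tokens, so that a strict RFC 8288 parser can accept the header.
object LenientLink {
  private val tcharSymbols: Set[Char] = "!#$%&'*+-.^_`|~".toSet

  private def isToken(s: String): Boolean =
    s.nonEmpty && s.forall { c =>
      (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
      (c >= 'A' && c <= 'Z') || tcharSymbols(c)
    }

  // Alternatives, tried left to right at each position:
  //   1. a <...> link target, copied through untouched
  //   2. a quoted-string, copied through untouched
  //   3. an unquoted rel parameter (group 1 = "; rel=", group 2 = its value)
  private val pattern: Regex =
    """(?i)<[^>]*>|"(?:[^"\\]|\\.)*"|(;\s*rel\s*=\s*)([^\s";,]+)""".r

  def quoteUnquotedRel(header: String): String =
    pattern.replaceAllIn(header, m =>
      (Option(m.group(1)), Option(m.group(2))) match {
        // Only non-token values need quotes; plain tokens such as acl stay as they are.
        case (Some(prefix), Some(value)) if !isToken(value) =>
          Regex.quoteReplacement(prefix + "\"" + value + "\"")
        case _ =>
          Regex.quoteReplacement(m.matched)
      }
    )
}
```

Applied to the failing header h2 from the top of this issue, this yields the same string as h1, which Link.parse accepts. Headers that are already quoted pass through unchanged.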