w3c / rdf-canon

RDF Dataset Canonicalization (deliverable of the RCH working group)

Home Page: https://w3c.github.io/rdf-canon/spec/

\u vs \U

pfps opened this issue

"Characters in the range from U+0000 to U+001F and U+007F (DEL) that are not represented using ECHAR MUST be represented by UCHAR."

It appears to me that U+0001 can be written as either \u0001 or \U00000001.

The same appears to hold for every character from \u0000 to \uFFFF: each can also be written in the eight-digit form, \U00000000 to \U0000FFFF respectively. That poses a quandary, since a canonical form should allow only one spelling per character.
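To make the ambiguity concrete, here is a minimal Python sketch. The `decode_uchar` helper and its regular expression are hypothetical, written only for this illustration and not taken from the spec or any RDF library. It decodes both UCHAR forms and checks that they denote the same character.

```python
import re

# Hypothetical pattern for the two UCHAR forms: \uXXXX and \UXXXXXXXX.
_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_uchar(text: str) -> str:
    """Replace every UCHAR escape in text with the character it denotes."""
    def repl(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return _UCHAR.sub(repl, text)


short_form = r"\u0001"      # four hex digits
long_form = r"\U00000001"   # eight hex digits

# Both spellings decode to the same single character, U+0001.
same = decode_uchar(short_form) == decode_uchar(long_form)
print(same)
```

Because both spellings round-trip to the same code point, a serializer that follows only the quoted sentence could emit either one.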

This does require clarification, perhaps with something like "using the shortest possible UCHAR representation". It would want some tests, too.
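One possible reading of the "shortest possible" suggestion, sketched in Python under stated assumptions: the `escape_control` function and the `ECHAR` table are hypothetical names, the table covers only the control characters that have an ECHAR form (`\t`, `\b`, `\n`, `\r`, `\f`), and the use of uppercase hex digits is an assumption rather than something the quoted text states.

```python
# Assumed ECHAR spellings for the control characters that have one.
ECHAR = {"\t": "\\t", "\b": "\\b", "\n": "\\n", "\r": "\\r", "\f": "\\f"}


def escape_control(ch: str) -> str:
    """Escape a single character, preferring ECHAR, then the 4-digit UCHAR form."""
    cp = ord(ch)
    if ch in ECHAR:
        return ECHAR[ch]
    if cp <= 0x1F or cp == 0x7F:
        # Control characters always fit in four hex digits, so the
        # shortest UCHAR spelling is \uXXXX, never \UXXXXXXXX.
        return f"\\u{cp:04X}"
    # Everything else is outside the scope of this sketch and passes through.
    return ch


print(escape_control("\x01"))
print(escape_control("\t"))
```

A test suite along the lines suggested above could then assert that the eight-digit form never appears in canonical output for these characters.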