Implementation feedback

Question

pchampin opened this issue a year ago

Pierre-Antoine Champin commented a year ago

Here is a list of remarks that I noted while following the spec to implement the algorithm in Rust.
Rather than creating a bunch of small issues, I kept everything in one big issue. We can split some items out if they need a separate discussion.

About code point order
- it might be useful to point out that, if strings are internally encoded as UTF-8, comparing them byte-wise (using standard "lexicographic" order) gives the same order as code point order
  - Wikipedia says so, and the Rust documentation confirms it
  - or at least, we should advise developers to check whether the string comparison operators in the language they use follow code point order or some other order
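
To make the point above concrete, here is a minimal Rust sketch (not taken from the spec or from my implementation). The helper names `cmp_code_points` and `cmp_utf16_units` are made up for illustration. It checks that Rust's built-in `str` ordering, which is byte-wise over UTF-8, agrees with code point order, while UTF-16 code unit order can disagree as soon as a supplementary-plane character is involved.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: compares two strings by Unicode code point order.
fn cmp_code_points(a: &str, b: &str) -> Ordering {
    a.chars().cmp(b.chars())
}

/// Hypothetical helper: compares two strings by UTF-16 code unit order,
/// which is what some languages use for their default string comparison.
fn cmp_utf16_units(a: &str, b: &str) -> Ordering {
    a.encode_utf16().cmp(b.encode_utf16())
}

fn main() {
    // U+FF5E is in the BMP (a single UTF-16 code unit, 0xFF5E).
    // U+1F600 is outside the BMP (a surrogate pair starting with 0xD83D).
    let bmp = "\u{FF5E}";
    let supplementary = "\u{1F600}";

    // Rust's built-in ordering on str is byte-wise over UTF-8...
    assert_eq!(bmp.cmp(supplementary), Ordering::Less);
    // ...and it agrees with code point order...
    assert_eq!(cmp_code_points(bmp, supplementary), Ordering::Less);
    // ...but UTF-16 code unit order gives the opposite answer.
    assert_eq!(cmp_utf16_units(bmp, supplementary), Ordering::Greater);
}
```

An implementation in a language whose strings compare by UTF-16 code units would therefore need an explicit code point comparison to follow the spec.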

algo Canonicalization
- step 3.2 "including repetitions" is a bit mysterious. I assume it means that I must allow n to occur several times in the list mapped to $h_f(n)$, but I don't see when this is supposed to happen
  - or does it just mean that $h_f(n)$ may occur several times? In which case this is not consistent with the definition of "hash to blank nodes" (map of hash to lists of nodes)
  - as a matter of fact, "add $h_f(n)$ and n" to the map is a bit ambiguous
- step 5.1 is a bit ambiguous: it mentions the Hash N-Degree Quads algorithm, but only to indicate the expected type of the list's elements; the algorithm is not meant to be called here (but in step 5.2.4)
  - the 'explanation' is equally confusing, because it says "this list establishes an order"
    - I suggest replacing the explanation with "this list will be populated by step 5.2, and will establish an order..."
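
For step 3.2, here is a sketch of one possible reading of "add $h_f(n)$ and n", assuming the "hash to blank nodes" map really is a map from hash to list of blank node identifiers. The type `HashToBlankNodes` and its methods are hypothetical names, and the hashes are placeholders rather than real digests. Under this reading, a hash "repeats" when two different nodes produce it, and each node is appended to the list stored under that hash.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the "hash to blank nodes" map:
/// each hash maps to the list of blank node identifiers that produced it.
#[derive(Default)]
struct HashToBlankNodes {
    map: BTreeMap<String, Vec<String>>,
}

impl HashToBlankNodes {
    /// One reading of "add h_f(n) and n": append `node` to the list stored
    /// under `hash`, creating that list if it does not exist yet.
    fn add(&mut self, hash: &str, node: &str) {
        self.map
            .entry(hash.to_owned())
            .or_default()
            .push(node.to_owned());
    }

    /// Blank node identifiers sharing `hash`, in insertion order.
    fn nodes(&self, hash: &str) -> &[String] {
        self.map.get(hash).map(Vec::as_slice).unwrap_or(&[])
    }
}

fn main() {
    let mut h2bn = HashToBlankNodes::default();
    // "h1" and "h2" are placeholder hashes, not real digests.
    h2bn.add("h1", "e0");
    h2bn.add("h2", "e1");
    h2bn.add("h1", "e2"); // same hash as e0, different node

    assert_eq!(h2bn.nodes("h1"), ["e0", "e2"]);
    assert_eq!(h2bn.nodes("h2"), ["e1"]);
    assert!(h2bn.nodes("h3").is_empty());
}
```

A `BTreeMap` is used here only because it iterates over its keys in sorted order, which is convenient when hashes must later be processed in code point order; this is a design choice of the sketch, not something the spec mandates.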

algo Hash Related Blank Node
- ~~it might be useful to hint that the issuer passed to this algo is not mutated by the algorithm~~
- calling Hash 1st Degree Quads in step 4 raises the question of optimizing it if we are going to call it several times with the same node
  - if some form of memoization could help, this should probably be hinted -- in particular, we could store it in the c14n state...
  - as a matter of fact, this is what my implementation does
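
The memoization idea above could look roughly like the following sketch. It assumes that the first-degree hash of a node depends only on the input dataset and the node itself, so that it can be cached for the whole run. `C14nState`, `first_degree_cache` and `compute_first_degree_hash` are hypothetical names, and the "hash" computed here is a placeholder string, not the real Hash First Degree Quads algorithm.

```rust
use std::collections::HashMap;

/// Hypothetical slice of the canonicalization state, holding a cache of
/// first-degree hashes keyed by blank node identifier.
#[derive(Default)]
struct C14nState {
    first_degree_cache: HashMap<String, String>,
    /// Counts actual computations, only to make the caching visible here.
    computations: usize,
}

impl C14nState {
    /// Placeholder for Hash First Degree Quads: NOT the real algorithm.
    fn compute_first_degree_hash(&mut self, node: &str) -> String {
        self.computations += 1;
        format!("hash-of-{}", node)
    }

    /// Returns the cached hash for `node`, computing it on first use only.
    fn hash_first_degree_quads(&mut self, node: &str) -> String {
        if let Some(hash) = self.first_degree_cache.get(node) {
            return hash.clone();
        }
        let hash = self.compute_first_degree_hash(node);
        self.first_degree_cache.insert(node.to_owned(), hash.clone());
        hash
    }
}

fn main() {
    let mut state = C14nState::default();
    let first = state.hash_first_degree_quads("e0");
    let second = state.hash_first_degree_quads("e0"); // served from the cache
    assert_eq!(first, second);
    assert_eq!(state.computations, 1);
}
```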

algo Hash N-degree Quads
- ~~it might be useful to hint that the issuer passed to this algo is not mutated by the algorithm~~