w3c / rdf-canon

RDF Dataset Canonicalization (deliverable of the RCH working group)

Home Page: https://w3c.github.io/rdf-canon/spec/

Algorithm interface vs DoS defense

jyasskin opened this issue

https://www.w3.org/TR/rdf-canon/#canon-algorithm says

The canonicalization algorithm converts an input dataset into a canonicalized dataset.

However, https://www.w3.org/TR/rdf-canon/#canon-algo-algo says

Implementations MUST prevent against potential denial-of-service attacks.

The interface description implies that implementations will still produce a canonicalized dataset after detecting a denial-of-service attack, but the algorithm says nothing about how they are supposed to do that. I suspect the interface should change to say that the algorithm returns either a canonicalized dataset or a failure.
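To make the "either a canonicalized dataset or a failure" shape concrete, here is a minimal Python sketch. All of the names in it (`CanonicalizedDataset`, `CanonicalizationFailure`, `describe`) are hypothetical and chosen only for illustration; the spec defines none of them.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class CanonicalizedDataset:
    """Hypothetical success result (not a type defined by the spec)."""
    nquads: str


@dataclass(frozen=True)
class CanonicalizationFailure:
    """Hypothetical failure result, e.g. for a suspected poisoned dataset."""
    reason: str


# The interface would promise one of the two outcomes, never only the first.
CanonicalizationResult = Union[CanonicalizedDataset, CanonicalizationFailure]


def describe(result: CanonicalizationResult) -> str:
    """Shows that a caller has to handle both outcomes explicitly."""
    if isinstance(result, CanonicalizedDataset):
        return "ok"
    return f"failed: {result.reason}"
```

A specification building on this one could then say what happens in the failure case instead of assuming a canonicalized dataset always comes back.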

Hi @jyasskin. The language about denial of service attacks is related to the potential for providing a poisoned dataset, and that implementations must guard against this. Possible ways to mitigate this are described in 7.1 Dataset Poisoning, but these are not exhaustive and only suggest ways that might be used to detect an attack.

Note that we were careful to not define an API for doing canonicalization, nor to specify how errors are handled; this is considered out of scope for this particular spec, but I would expect that other specifications leveraging this spec would describe language appropriate for their situation. If we were to be more API-centric, we might say something about rejecting a promise or raising an error, which would short-circuit the actual return from the algorithm.

My own implementation raises an exception when such a dataset is detected.
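For illustration only, the general pattern might look like the Python sketch below. It is not the code of any actual implementation: `PoisonedDatasetError`, `WorkBudget`, and the stand-in `canonicalize` function are hypothetical names, and a simple cap on the number of costly steps is just one assumed way of detecting an attack.

```python
class PoisonedDatasetError(Exception):
    """Hypothetical error type; the spec does not name one."""


class WorkBudget:
    """Counts units of work and raises once a caller-chosen limit is exceeded."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def spend(self) -> None:
        self.steps += 1
        if self.steps > self.max_steps:
            raise PoisonedDatasetError(
                f"gave up after {self.max_steps} steps; dataset may be poisoned"
            )


def canonicalize(expensive_steps: int, budget: WorkBudget) -> str:
    """Stand-in for the real algorithm; only the budget check is illustrated."""
    for _ in range(expensive_steps):
        budget.spend()  # a real implementation would call this around each costly hashing step
    return "canonicalized"
```

With this shape, `canonicalize(3, WorkBudget(5))` returns normally, while `canonicalize(10, WorkBudget(5))` raises `PoisonedDatasetError` instead of returning a result.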

Adding "or raises an error" to the interface definition would make that clear. It's not about defining an API for a programming language, but rather about defining an interface that other specifications can use.

"My own implementation raises an exception when such a dataset is detected."

FWIW, this is exactly what my implementation does as well... (and I did not sync this with @gkellogg :-)

Making this more explicit in the text sounds like a good idea to me. I guess a simple additional reference to raising an error instead of returning a result could be put into item (7) of the algorithm. A reference to the error could possibly also be put into the introductory text of the same section.

@jyasskin I addressed your concern in #189, please let us know if that is satisfactory.