Algorithm interface vs DoS defense
jyasskin opened this issue · comments
https://www.w3.org/TR/rdf-canon/#canon-algorithm says
The canonicalization algorithm converts an input dataset into a canonicalized dataset.
However, https://www.w3.org/TR/rdf-canon/#canon-algo-algo says
Implementations MUST prevent against potential denial-of-service attacks.
The interface claims that they're still going to produce a canonicalized dataset after detecting a denial-of-service attack, but the algorithm doesn't say anything about how they're supposed to do that. I suspect the interface should change to say that the algorithm returns either a canonicalized dataset or a failure.
Hi @jyasskin. The language about denial of service attacks is related to the potential for providing a poisoned dataset, and that implementations must guard against this. Possible ways to mitigate this are described in 7.1 Dataset Poisoning, but these are not exhaustive and only suggest ways that might be used to detect an attack.
Note that we were careful to not define an API for doing canonicalization, nor to specify how errors are handled; this is considered out of scope for this particular spec, but I would expect that other specifications leveraging this spec would describe language appropriate for their situation. If we were to be more API-centric, we might say something about rejecting a promise or raising an error, which would short-circuit the actual return from the algorithm.
My own implementation raises an exception when such a dataset is detected.
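A minimal sketch of that pattern, for illustration only. It assumes a simple "work budget" as the detection heuristic; the names `PoisonedDatasetError`, `WorkBudget`, and `max_steps` are hypothetical and come from neither the spec nor any particular implementation. Sorting stands in for the real canonicalization steps.

```python
class PoisonedDatasetError(Exception):
    """Hypothetical error raised when the work limit is exceeded."""


class WorkBudget:
    """Counts units of work and aborts once a configured limit is passed."""

    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.used = 0

    def spend(self, steps=1):
        self.used += steps
        if self.used > self.max_steps:
            raise PoisonedDatasetError(
                f"work limit of {self.max_steps} steps exceeded"
            )


def canonicalize(quads, max_steps=10_000):
    """Stand-in for the real algorithm: charges one step per quad, then sorts.

    A real implementation would charge the budget inside its expensive
    steps instead; the sort here is only a placeholder.
    """
    budget = WorkBudget(max_steps)
    seen = []
    for quad in quads:
        budget.spend()  # raises before any result is returned
        seen.append(quad)
    return sorted(seen)
```

Because the exception propagates out of `canonicalize`, the caller never receives a partial or non-canonical dataset.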
Adding "or raises an error" to the interface definition would make that clear. It's not about defining an API for a programming language, but rather about defining an interface that other specifications can use.
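To make the "canonicalized dataset or failure" shape concrete, here is a rough sketch of such an interface as a tagged result. Everything in it is an assumption for illustration: the type names `Canonicalized` and `Failure`, the function `canonicalize_or_fail`, and the size-based check (`max_quads`) are hypothetical, and sorting stands in for the actual algorithm.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Canonicalized:
    """Successful outcome: carries the (stand-in) canonicalized quads."""
    quads: Tuple[tuple, ...]


@dataclass(frozen=True)
class Failure:
    """Failed outcome: carries a human-readable reason."""
    reason: str


Result = Union[Canonicalized, Failure]


def canonicalize_or_fail(quads, max_quads=1000) -> Result:
    """Return either a canonicalized dataset or a failure, never both.

    The size check is a placeholder for whatever poisoning detection an
    implementation actually uses.
    """
    if len(quads) > max_quads:
        return Failure("input exceeds the configured limit")
    return Canonicalized(tuple(sorted(quads)))
```

A specification building on this interface could then state what it does for each of the two outcomes, without depending on any language's exception mechanism.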
My own implementation raises an exception when such a dataset is detected.
FWIW, this is exactly what my implementation does as well... (and I did not sync this with @gkellogg :-)
Making this more explicit in the text sounds like a good idea to me. I guess a simple additional reference to raising an error instead of returning a result could be put into item (7) of the algorithm. A reference to the error could possibly also be put into the introductory text of the same section.