Algorithm interface vs DoS defense

jyasskin opened this issue 8 months ago

Jeffrey Yasskin commented 8 months ago

https://www.w3.org/TR/rdf-canon/#canon-algorithm says

"The canonicalization algorithm converts an input dataset into a canonicalized dataset."

However, https://www.w3.org/TR/rdf-canon/#canon-algo-algo says

"Implementations MUST prevent against potential denial-of-service attacks."

The interface implies that implementations still produce a canonicalized dataset after detecting a denial-of-service attack, but the algorithm doesn't say how they're supposed to do that. I suspect the interface should change to say that the algorithm returns either a canonicalized dataset or a failure.
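To illustrate the "either a canonicalized dataset or a failure" shape, here is a minimal Python sketch. Everything in it is hypothetical: the names `Canonicalized`, `Failure`, `canonicalize`, and `max_quads` are made up, the size check merely stands in for real attack detection, and sorting stands in for the actual canonicalization algorithm.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Canonicalized:
    """Successful outcome: the canonicalized dataset (here, just sorted strings)."""
    nquads: Tuple[str, ...]


@dataclass(frozen=True)
class Failure:
    """Unsuccessful outcome: the algorithm gave up, e.g. on a suspected attack."""
    reason: str


# The interface returns one of the two outcomes, never a partial dataset.
Result = Union[Canonicalized, Failure]


def canonicalize(quads: List[str], max_quads: int = 1000) -> Result:
    # ASSUMPTION: a plain size limit stands in for real poisoning detection.
    if len(quads) > max_quads:
        return Failure(f"input has {len(quads)} quads, limit is {max_quads}")
    # ASSUMPTION: de-duplicating and sorting stands in for the real algorithm.
    return Canonicalized(tuple(sorted(set(quads))))
```

A specification that builds on such an interface can then state what it does in the `Failure` case, rather than assuming a dataset always comes back.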

Gregg Kellogg · Answer 1 · Tue Nov 21 2023 07:03:20 GMT+0800 (China Standard Time)

Hi @jyasskin. The language about denial-of-service attacks concerns the possibility of someone providing a poisoned dataset, which implementations must guard against. Possible ways to mitigate this are described in 7.1 Dataset Poisoning, but these are not exhaustive and only suggest ways that might be used to detect an attack.

Note that we were careful to not define an API for doing canonicalization, nor to specify how errors are handled; this is considered out of scope for this particular spec, but I would expect that other specifications leveraging this spec would describe language appropriate for their situation. If we were to be more API-centric, we might say something about rejecting a promise or raising an error, which would short-circuit the actual return from the algorithm.

My own implementation raises an exception when such a dataset is detected.
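As a rough illustration of the "raise an exception" approach, the Python sketch below aborts once a work budget is exhausted. It is not taken from any of the implementations mentioned in this thread: `PoisonedDatasetError`, `WorkBudget`, `canonicalize_or_raise`, and `max_steps` are hypothetical names, the step budget is just one assumed way of detecting excessive work, and sorting stands in for the real algorithm.

```python
from typing import List


class PoisonedDatasetError(Exception):
    """Raised when processing exceeds the allowed amount of work."""


class WorkBudget:
    """Counts units of work and raises once the limit is exceeded."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def spend(self, units: int = 1) -> None:
        self.used += units
        if self.used > self.limit:
            raise PoisonedDatasetError(
                f"work budget of {self.limit} steps exceeded"
            )


def canonicalize_or_raise(quads: List[str], max_steps: int = 10_000) -> List[str]:
    budget = WorkBudget(max_steps)
    seen = set()
    for quad in quads:
        # ASSUMPTION: one step per quad; a real implementation would charge
        # the budget inside its expensive recursive hashing steps instead.
        budget.spend()
        seen.add(quad)
    # ASSUMPTION: sorting stands in for the real canonicalization algorithm.
    return sorted(seen)
```

Because the exception short-circuits the return, a caller either receives a complete result or handles `PoisonedDatasetError`; no partially processed dataset escapes.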

Jeffrey Yasskin · Answer 2 · Tue Nov 21 2023 07:12:00 GMT+0800 (China Standard Time)

Adding "or raises an error" to the interface definition would make that clear. It's not about defining an API for a programming language, but rather about defining an interface that other specifications can use.

Ivan Herman · Answer 3 · Tue Nov 21 2023 21:41:03 GMT+0800 (China Standard Time)

"My own implementation raises an exception when such a dataset is detected."

FWIW, this is exactly what my implementation does as well... (and I did not sync this with @gkellogg :-)

Making this more explicit in the text sounds like a good idea to me. I guess a simple additional reference to raising an error instead of returning a result could be put into item (7) of the algorithm. Possibly a reference to the error could also be put into the introductory text of the same section.

Gregg Kellogg · Answer 4 · Wed Nov 29 2023 05:32:37 GMT+0800 (China Standard Time)

@jyasskin I addressed your concern in #189, please let us know if that is satisfactory.