Possible ambiguity in documentation of `xxx_setup()` functions

mpg opened this issue 2 years ago · comments

Manuel Pégourié-Gonnard commented 2 years ago

The draft PAKE extension, in the documentation of psa_pake_setup(), states:

If an error occurs at any step after a call to psa_pake_setup(), the operation will need to be reset by a call to psa_pake_abort().

This seems to leave some ambiguity as to whether the user needs to call psa_pake_abort() when psa_pake_setup() returns an error, as there are two reasonable interpretations of that sentence:

- The call to psa_pake_setup() is not a "step after a call to psa_pake_setup()", so it is not covered by that statement, and the user might not be required to call psa_pake_abort() when psa_pake_setup() fails.
- If psa_pake_setup() returns an error, that is an event that happens after psa_pake_setup() was called, so the statement applies to it and the user is required to call psa_pake_abort().

Actually, all xxx_setup() functions seem to follow this pattern.

Note: @gilles-peskine-arm points out that the general conventions say that if xxx_setup() fails, it leaves the operation in an undefined state, so clearly if you want to re-use that operation object you need to call xxx_abort() on it first, in order to bring it back to a defined state. But what if you don't want to re-use it? Do you still need to call xxx_abort() on it (under possible pain of leaking resources or leaving your client-server setup in an inconsistent state)?

Gilles Peskine · Answer 1 · Fri Dec 09 2022 19:04:41 GMT+0800 (China Standard Time)

what if you don't want to re-use it? Do you still need to call xxx_abort() on it (under possible pain of leaking resources or leaving your client-server setup in an inconsistent state)?

Yes to both. setup() isn't required to clean up after itself, e.g. because the implementation bails out on the first problem to minimize code size, or because it's a client-server implementation and the connection to the server was lost. So if the application doesn't call abort(), on some reasonable implementations, there can be resource leaks.
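The concern above can be illustrated with a small self-contained model. This is only a sketch: `toy_op_t`, `toy_setup()`, `toy_abort()` and `toy_slots_in_use` are hypothetical stand-ins invented for illustration, not part of the PSA Crypto API. The toy `toy_setup()` models an implementation that bails out on the first problem without cleaning up, and the two callers show what happens to the held resource with and without a call to abort on the failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the PSA Crypto API. */
typedef struct { bool holds_slot; } toy_op_t;

/* Number of resources the toy "implementation" currently holds. */
int toy_slots_in_use = 0;

/* Models a setup() that bails out on the first problem without cleanup. */
int toy_setup(toy_op_t *op, bool fail_late)
{
    assert(op != NULL);
    op->holds_slot = true;
    toy_slots_in_use++;
    if (fail_late)
        return -1;   /* error returned while the slot is still held */
    return 0;
}

/* Models abort(): releases whatever is held, no-op if nothing is held. */
int toy_abort(toy_op_t *op)
{
    assert(op != NULL);
    if (op->holds_slot) {
        op->holds_slot = false;
        toy_slots_in_use--;
    }
    return 0;
}

/* Caller that aborts on every path, including a failed setup. */
int caller_always_aborts(bool fail_late)
{
    toy_op_t op = { false };
    int status = toy_setup(&op, fail_late);
    /* ... update/finish calls would go here when status == 0 ... */
    toy_abort(&op);
    return status;
}

/* Caller that assumes a failed setup needs no cleanup. */
int caller_skips_abort_on_setup_failure(bool fail_late)
{
    toy_op_t op = { false };
    int status = toy_setup(&op, fail_late);
    if (status != 0)
        return status;   /* the slot leaks under this toy implementation */
    toy_abort(&op);
    return status;
}
```

With this toy implementation, the first caller leaves `toy_slots_in_use` at zero whether or not setup fails, while the second leaves one slot held after a failed setup.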

Gilles Peskine · Answer 2 · Fri Dec 09 2022 19:08:04 GMT+0800 (China Standard Time)

The wording is slightly different in psa_pake_setup and the earlier setup functions, but even with the earlier setup functions, what happens after a failure in setup is unclear and can only be determined by looking at the general conventions.

For example, psa_hash_setup:

If an error occurs at any step after a call to psa_hash_setup, the operation will need to be reset by a call to psa_hash_abort.

The interpretation that “after a call” means “after a call is started” rather than “after a call completes”, and thus includes errors happening inside the call itself, is too subtle for comfort.

Andrew Thoelke · Answer 3 · Fri Dec 09 2022 22:03:12 GMT+0800 (China Standard Time)

I think that the ambiguity in the wording within the setup function comes from the use of the word 'step':

- If understood as 'point' then an error 'at any point' after the client has called the psa_xxx_setup() function clearly includes an error response from the setup function itself.
- If understood in the context of the 'sequence of steps to carry out a multi-part operation', then the setup 'step' is not after the setup 'step'.
- For key derivation and PAKE operations, 'step' could also be associated (incorrectly) with 'input steps' or 'PAKE steps'. But the ambiguous wording also applies to any of the multi-part operation functions that are called, not just those that take a 'step' parameter.

Specification

With regard to the specification, §3.3.2 Multi-part operations is probably definitive on this subject, rather than the general conventions. It states:

Setup: Start a new multi-part operation on an inactive operation object. Each operation object will define one or more setup functions to start a specific operation.

On success, a setup function will put an operation object into an active state. On failure, the operation object will remain inactive.

Additionally, for every multi-part operation function that requires the caller to invoke the associated psa_xxx_abort() function in the case of an error response, this is explicitly and clearly documented in the description of the API. For example, on psa_hash_update():

If this function returns an error status, the operation enters an error state and must be aborted by calling psa_hash_abort().
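The lifecycle quoted above can be sketched as a tiny state machine. This is a model written for illustration, not code from the specification or from any implementation: `model_op_t`, `model_setup()`, `model_update()`, `model_abort()` and the status values are all hypothetical names. The behavior of a call made in the wrong state (returning an error and leaving the state unchanged) is a choice made for this sketch and is not taken from the quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the multi-part operation lifecycle. */
typedef enum { OP_INACTIVE, OP_ACTIVE, OP_ERROR } op_state_t;
typedef struct { op_state_t state; } model_op_t;

enum { MODEL_OK = 0, MODEL_FAILED = -1, MODEL_BAD_STATE = -2 };

/* Setup: on success the object becomes active; on failure it stays inactive. */
int model_setup(model_op_t *op, int alg_supported)
{
    assert(op != NULL);
    if (op->state != OP_INACTIVE)
        return MODEL_BAD_STATE;   /* sketch's choice: state left unchanged */
    if (!alg_supported)
        return MODEL_FAILED;      /* still inactive: nothing to abort */
    op->state = OP_ACTIVE;
    return MODEL_OK;
}

/* Update: an error moves an active object into the error state. */
int model_update(model_op_t *op, int fail)
{
    assert(op != NULL);
    if (op->state != OP_ACTIVE)
        return MODEL_BAD_STATE;   /* sketch's choice: state left unchanged */
    if (fail) {
        op->state = OP_ERROR;     /* caller must now call model_abort() */
        return MODEL_FAILED;
    }
    return MODEL_OK;
}

/* Abort: always returns the object to the inactive state. */
int model_abort(model_op_t *op)
{
    assert(op != NULL);
    op->state = OP_INACTIVE;
    return MODEL_OK;
}
```

In this model a failed `model_setup()` never requires `model_abort()`, whereas a failed `model_update()` does, which mirrors the two quoted passages.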

Programmer expectations

In relation to programmer intuition: the setup function is like a "resource acquisition" action, or a non-trivial object constructor. When such a function fails, the normal developer expectation is that no resources are allocated, not that the caller must call a "resource release"/finalizer function.

Resolution

My preference is to follow the intuitive 'setup is atomic' model, described by §3.3.2, which also ensures the best compatibility with applications that have assumed this model.

In any case, there are some clarifications to the specification that we should make:

Improve the wording in all of the setup functions to be clear what must be done in the case of an error response.
The §5.3.2 Behavior on error needs some revision:
- There are no psa_abort_xxx() functions; the abort functions are named psa_xxx_abort().
- PSA_ERROR_BAD_STATE, or any other error, from a multi-part function might leave the operation object in an "error state", but that is not what is usually understood by the term "undefined state".
- There could be a reference to the multi-part operation lifecycle.
And maybe we do need a general multi-part operation state model diagram in §3.3.2?

Gilles Peskine · Answer 4 · Tue Dec 13 2022 18:31:56 GMT+0800 (China Standard Time)

Thanks for reminding us of the multi-part operations section. I agree that this unambiguously means that setup functions must clean up on error. It's an added burden on implementations compared to my earlier mental representation, but a small one.
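On the implementation side, the "small added burden" might look like the following sketch, in which a setup function releases anything it acquired before returning an error. All names here (`impl_op_t`, `impl_setup()`, `impl_abort()`, `impl_live_contexts`, the status values and the 64-byte context) are hypothetical and chosen only for illustration; a real implementation will differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical operation object: ctx == NULL means "inactive". */
typedef struct { void *ctx; } impl_op_t;

/* Number of contexts currently allocated, to make leaks observable. */
int impl_live_contexts = 0;

enum { IMPL_OK = 0, IMPL_NOT_SUPPORTED = -1, IMPL_BAD_STATE = -2,
       IMPL_NO_MEMORY = -3 };

/* Releases the context if there is one; safe on an inactive object. */
int impl_abort(impl_op_t *op)
{
    assert(op != NULL);
    if (op->ctx != NULL) {
        free(op->ctx);
        op->ctx = NULL;
        impl_live_contexts--;
    }
    return IMPL_OK;
}

/* Setup that cleans up after itself: on any error the object is inactive. */
int impl_setup(impl_op_t *op, int alg_supported)
{
    assert(op != NULL);
    if (op->ctx != NULL)
        return IMPL_BAD_STATE;

    op->ctx = malloc(64);             /* acquire a resource early */
    if (op->ctx == NULL)
        return IMPL_NO_MEMORY;        /* nothing acquired, still inactive */
    impl_live_contexts++;

    if (!alg_supported) {
        impl_abort(op);               /* undo the acquisition before failing */
        return IMPL_NOT_SUPPORTED;
    }
    return IMPL_OK;
}
```

Routing the failure path through the same cleanup function that abort uses keeps the extra code small, at the cost of one additional call on each error exit.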