Mbed-TLS / mbedtls

An open source, portable, easy to use, readable and flexible TLS library, and reference implementation of the PSA Cryptography API. Releases are on a varying cadence, typically around 3 - 6 months between releases.

Home Page: https://www.trustedfirmware.org/projects/mbed-tls/

The PSA code is not thread-safe

gilles-peskine-arm opened this issue

Description

The PSA Cryptography API specification defines who is responsible for managing concurrency in calls to the PSA Cryptography API, between the applications and the implementation.

In a nutshell, it's up to the application to not use operation objects concurrently, and it's up to the implementation to allow concurrent use of the key store.

Mbed Crypto currently does not have any protection against concurrent use of the key store, so it cannot be used in a multithreaded application.

As a first step, the goal of this issue is to comply with the API specification and nothing more. Just support API calls that access keys from concurrent threads. Protect the key store with a lock. Take the lock in any function that accesses the key store (in psa_get_key_slot), and add a release function. All API functions must call the release function before returning.

This means that we do I/O to store and load persistent keys, and wait for a response from a secure element or hardware accelerator, with a lock held. This isn't ideal, but can be fixed later.
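
A minimal sketch of this first step, assuming POSIX threads instead of the library's own threading abstraction. Every `demo_` name is hypothetical and is not the Mbed TLS code: the lookup function takes a single key store lock and keeps it held on success, and each API-style function calls the release function before returning.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_SLOT_COUNT 4   /* hypothetical key store size */

typedef struct {
    int occupied;
    unsigned key_id;
} demo_slot_t;

static demo_slot_t demo_slots[DEMO_SLOT_COUNT];
static pthread_mutex_t demo_store_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Look up a slot. On success the key store lock stays HELD and the
 * caller must call demo_release_key_slot(). On failure it is released. */
demo_slot_t *demo_get_key_slot(unsigned key_id)
{
    pthread_mutex_lock(&demo_store_mutex);
    for (size_t i = 0; i < DEMO_SLOT_COUNT; i++) {
        if (demo_slots[i].occupied && demo_slots[i].key_id == key_id) {
            return &demo_slots[i];
        }
    }
    pthread_mutex_unlock(&demo_store_mutex);
    return NULL;
}

/* Release function: every API function calls this before returning. */
void demo_release_key_slot(demo_slot_t *slot)
{
    assert(slot != NULL);
    pthread_mutex_unlock(&demo_store_mutex);
}

/* Creating a key also touches the key store, so it takes the lock too. */
int demo_import_key(unsigned key_id)
{
    int ret = -1;
    pthread_mutex_lock(&demo_store_mutex);
    for (size_t i = 0; i < DEMO_SLOT_COUNT; i++) {
        if (!demo_slots[i].occupied) {
            demo_slots[i].occupied = 1;
            demo_slots[i].key_id = key_id;
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&demo_store_mutex);
    return ret;
}

/* API-style function: get the slot, use it, release it. */
int demo_key_exists(unsigned key_id)
{
    demo_slot_t *slot = demo_get_key_slot(key_id);
    if (slot == NULL) {
        return 0;
    }
    /* Any I/O or accelerator call made here would run with the lock held. */
    demo_release_key_slot(slot);
    return 1;
}
```

With one coarse lock, every key operation is serialized. That is the cost described above, and it is what a later refinement would reduce.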

Note that to make the code fully thread-safe, RNG access must be protected, not just key access. This is tracked in #3391. RNG queries (not initialization or explicit reseeding, but including automatic reseeding) are thread-safe when using the built-in PRNG, but not when using MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG.

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug

@gilles-peskine-arm - could you please point me to current equivalent of https://armmbed.github.io/mbed-crypto/html/general.html#concurrent-calls, and/or any relevant documentation on this issue?

@AndrzejKurek The section has been moved to https://armmbed.github.io/mbed-crypto/html/overview/conventions.html#concurrency.

Every function that accesses a key slot already calls psa_get_and_lock_key_slot_in_memory (or more commonly one of its wrapper functions) before and psa_unlock_key_slot after. So the gist of this task is to add a mutex to these functions, and make sure the library compiles both with and without mutex support.
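
One way to keep the code building both with and without mutex support is sketched below. `DEMO_HAVE_THREADS` and all `demo_` names are hypothetical stand-ins for this illustration, and POSIX threads are an assumption. The point is only that the get-and-lock and unlock functions keep the same shape in both builds, with the mutex calls compiling to no-ops when threading is disabled.

```c
#include <assert.h>
#include <stddef.h>

/* DEMO_HAVE_THREADS is a hypothetical build switch for this sketch. */
#if defined(DEMO_HAVE_THREADS)
#include <pthread.h>
typedef pthread_mutex_t demo_mutex_t;
#define DEMO_MUTEX_INIT PTHREAD_MUTEX_INITIALIZER
static int demo_mutex_lock(demo_mutex_t *m)   { return pthread_mutex_lock(m); }
static int demo_mutex_unlock(demo_mutex_t *m) { return pthread_mutex_unlock(m); }
#else
/* No threading support: the mutex is a placeholder and locking is a no-op. */
typedef int demo_mutex_t;
#define DEMO_MUTEX_INIT 0
static int demo_mutex_lock(demo_mutex_t *m)   { (void) m; return 0; }
static int demo_mutex_unlock(demo_mutex_t *m) { (void) m; return 0; }
#endif

typedef struct {
    demo_mutex_t mutex;
    int occupied;
} demo_slot_t;

/* A single pre-populated slot keeps the sketch short. */
static demo_slot_t demo_slot = { DEMO_MUTEX_INIT, 1 };

/* Returns the slot with its mutex held, or NULL with nothing held. */
demo_slot_t *demo_get_and_lock_slot(void)
{
    if (demo_mutex_lock(&demo_slot.mutex) != 0) {
        return NULL;
    }
    if (!demo_slot.occupied) {
        (void) demo_mutex_unlock(&demo_slot.mutex);
        return NULL;
    }
    return &demo_slot;
}

/* Counterpart of demo_get_and_lock_slot(). Returns 0 on success. */
int demo_unlock_slot(demo_slot_t *slot)
{
    assert(slot != NULL);
    return demo_mutex_unlock(&slot->mutex);
}
```

Callers are identical in both configurations, so only the small wrapper layer needs to be covered by both build variants.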

We should also validate that mutexes are used correctly (e.g. no double lock, no unlock on a non-locked mutex, no mutex left unlocked); I think that with MBEDTLS_TEST_HOOKS turned on, the mbedtls_test_mutex_usage_xxx instrumentation is enough.
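
To illustrate the kind of check meant here, the following is a hedged sketch of an instrumented mutex wrapper. It is not the `mbedtls_test_mutex_usage_xxx` code: all `demo_` names are hypothetical, POSIX threads are assumed, and the state tracking assumes a single-threaded test harness, where seeing a mutex already locked at lock time means the caller would deadlock. Reading the last check as "no mutex still held at the end of a test" is also an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t mutex;
    int locked;                    /* 1 while the test thread holds it */
} demo_checked_mutex_t;

static int demo_usage_errors = 0;  /* misuses detected so far */
static int demo_live_locks = 0;    /* mutexes currently held */

void demo_checked_init(demo_checked_mutex_t *m)
{
    assert(m != NULL);
    pthread_mutex_init(&m->mutex, NULL);
    m->locked = 0;
}

int demo_checked_lock(demo_checked_mutex_t *m)
{
    if (m->locked) {
        /* Double lock: record it instead of deadlocking. */
        demo_usage_errors++;
        return -1;
    }
    pthread_mutex_lock(&m->mutex);
    m->locked = 1;
    demo_live_locks++;
    return 0;
}

int demo_checked_unlock(demo_checked_mutex_t *m)
{
    if (!m->locked) {
        /* Unlock of a mutex that is not locked. */
        demo_usage_errors++;
        return -1;
    }
    m->locked = 0;
    demo_live_locks--;
    pthread_mutex_unlock(&m->mutex);
    return 0;
}

/* Call at the end of a test case. Returns 1 if no misuse was recorded
 * and no mutex is still held, 0 otherwise. */
int demo_check_usage(void)
{
    return demo_usage_errors == 0 && demo_live_locks == 0;
}
```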

Testing concurrent behavior is out of scope of this task. We don't have the prerequisites to write portable concurrent applications (our platform abstraction for threading doesn't include thread management, only concurrency management between externally created threads).

Should this replace the key slot locking mechanism from psa_lock_key_slot and psa_unlock_key_slot?
These functions let users lock a slot multiple times, and it will not be possible anymore with a mutex in place.

Ah, sorry, I'd forgotten about that. The lock_count field includes two cases: API functions currently executing, and a key that's been opened by psa_open_key (and not yet closed by psa_close_key). API functions currently executing is an exclusive case (at least for now — we could allow multiple functions to read from a slot as long as nobody's writing, but let's start simple). Having a key open is not an exclusive case.

So the behavior should be:

  • Rename lock_count to open_count, I guess
  • psa_open_key: lock, increase open_count, release
  • psa_close_key: lock, decrease open_count, destroy if open_count==0, otherwise release
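
The list above can be sketched roughly as follows. This assumes POSIX threads, and every `demo_` name is hypothetical. It also assumes that "destroy" means wiping the slot contents before the mutex is unlocked, which the thread does not spell out.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

typedef struct {
    pthread_mutex_t mutex;
    unsigned open_count;           /* handles opened and not yet closed */
    int occupied;
    unsigned char material[16];    /* placeholder for key material */
} demo_slot_t;

/* Set up a slot that already contains a (fake) key. */
void demo_slot_init(demo_slot_t *slot)
{
    assert(slot != NULL);
    pthread_mutex_init(&slot->mutex, NULL);
    slot->open_count = 0;
    slot->occupied = 1;
    memset(slot->material, 0xA5, sizeof(slot->material));
}

/* Open: lock, increase open_count, release. */
int demo_open_key(demo_slot_t *slot)
{
    pthread_mutex_lock(&slot->mutex);
    slot->open_count++;
    pthread_mutex_unlock(&slot->mutex);
    return 0;
}

/* Close: lock, decrease open_count, wipe the slot if the count reaches
 * zero, then release. Returns -1 if the key was not open. */
int demo_close_key(demo_slot_t *slot)
{
    pthread_mutex_lock(&slot->mutex);
    if (slot->open_count == 0) {
        pthread_mutex_unlock(&slot->mutex);
        return -1;
    }
    slot->open_count--;
    if (slot->open_count == 0) {
        /* Real code would use a zeroization routine that the compiler
         * cannot optimize away. */
        memset(slot->material, 0, sizeof(slot->material));
        slot->occupied = 0;
    }
    pthread_mutex_unlock(&slot->mutex);
    return 0;
}
```

Because the mutex is held only for the duration of each call, several handles can stay open at once while API calls on the slot remain mutually exclusive.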

Please take a look at this PoC: AndrzejKurek@f6ad162
The downside that I can see here is that after calling psa_start_key_creation the mutex will be locked until the slot is unlocked.

@AndrzejKurek That's a start, but there are more places to protect. At least psa_get_empty_key_slot is accessing key slots without locking them: two threads running this function concurrently could end up deciding to pick the same slot.

One way to resolve this would be to have a lock for the keystore as a whole, in addition to the lock for each key; you'd need to take the lock only for a short time, around any code that might possibly access a slot that isn't locked yet (with this approach, take care with the order of lock/unlock operations to avoid a deadlock). Generally every place that iterates over all the slots would take this lock, and in addition code that changes some “global” properties of a slot (at least the outcome of psa_is_key_slot_occupied) would need to have this global lock.

It's expected that psa_start_key_creation acquires the mutex for the slot it picks, and that it's released in psa_{finish,fail}_key_creation.
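
A minimal sketch of the two-level scheme described above, assuming POSIX threads. All `demo_` names are hypothetical and the slot structure is reduced to the bare minimum. The global lock is held only while scanning for and claiming an empty slot, and the lock order is always global first, then slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_SLOT_COUNT 3   /* hypothetical key store size */

typedef struct {
    pthread_mutex_t mutex;  /* per-slot lock */
    int occupied;           /* "global" property, guarded by the global lock */
} demo_slot_t;

static pthread_mutex_t demo_global_mutex = PTHREAD_MUTEX_INITIALIZER;
static demo_slot_t demo_slots[DEMO_SLOT_COUNT] = {
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Pick an empty slot. The scan runs under the global lock, so two threads
 * cannot choose the same slot. The slot is returned with its own mutex
 * held, or NULL is returned if the store is full. */
demo_slot_t *demo_start_key_creation(void)
{
    demo_slot_t *picked = NULL;
    pthread_mutex_lock(&demo_global_mutex);
    for (size_t i = 0; i < DEMO_SLOT_COUNT; i++) {
        if (!demo_slots[i].occupied) {
            picked = &demo_slots[i];
            picked->occupied = 1;
            /* An unoccupied slot's mutex is never held in this sketch,
             * so this lock does not block while the global lock is held. */
            pthread_mutex_lock(&picked->mutex);
            break;
        }
    }
    pthread_mutex_unlock(&demo_global_mutex);
    return picked;
}

/* Creation succeeded: the slot stays occupied, its mutex is released. */
void demo_finish_key_creation(demo_slot_t *slot)
{
    assert(slot != NULL);
    pthread_mutex_unlock(&slot->mutex);
}

/* Creation failed: release the slot mutex first, then take the global
 * lock to mark the slot empty, so the lock order is never inverted.
 * Real code would also need a slot state meaning "not usable yet". */
void demo_fail_key_creation(demo_slot_t *slot)
{
    assert(slot != NULL);
    pthread_mutex_unlock(&slot->mutex);
    pthread_mutex_lock(&demo_global_mutex);
    slot->occupied = 0;
    pthread_mutex_unlock(&demo_global_mutex);
}
```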

Great, thanks for the info. Just wanted to make sure we're aligned with expectations. I'll update this issue with a PR link once it's up.

Draft PR created here:
#5084

Hello!
Just as a quick heads-up, @ema and I ran into this issue last night while packaging the Rust crate psa-crypto in Debian. The autopackage tests run "cargo test", which is multi-threaded by default. We got some weird results from the unit tests because the PSA crypto API was being called from multiple threads at once.

As a workaround, the Debian package tests now run on a single thread:
cargo test -- --test-threads=1

Yeah, I took that from rust-psa-crypto's CI script, which runs the tests with one thread only, probably because of this very issue: https://github.com/parallaxsecond/rust-psa-crypto/blob/main/ci.sh#L39

Can you please let us know whether a fix for this is planned? If so, when can we expect it in the repo?

We plan to introduce PSA APIs for our crypto accelerators, and the applications using them need the implementation to be thread-safe.

@ruchi393 We'd like to fix this soon. We're currently looking at our planning for Jul-Sep and trying to see if this will fit.

The roadmap has just been updated. We can't commit to September, but we're aiming to have this in the development branch by the end of the year, to be released in early 2024.

Closing as completed by Threading MVP Epic