oscourse-tsinghua / rcore_plus

Rust version of THU uCore OS. Linux compatible.

Fix sys_futex race condition

wangrunji0408 opened this issue

According to the specification:

FUTEX_WAIT (since Linux 2.6.0)
This operation tests that the value at the futex word pointed
to by the address uaddr still contains the expected value val,
and if so, then sleeps waiting for a FUTEX_WAKE operation on
the futex word. The load of the value of the futex word is an
atomic memory access (i.e., using atomic machine instructions
of the respective architecture). This load, the comparison
with the expected value, and starting to sleep are performed
atomically and totally ordered with respect to other futex
operations on the same futex word. If the thread starts to
sleep, it is considered a waiter on this futex word. If the
futex value does not match val, then the call fails
immediately with the error EAGAIN.

Currently, our sys_futex does not perform the load of the futex word and the transition to sleep as one atomic operation.

So consider the following sequence of events, where thread A wants to wait on the futex and thread B wants to wake it:

  1. Thread A loads the futex word in kernel space and sees the expected value.
  2. Thread B stores a new value to the futex word in user space.
  3. Thread B wakes up the wait queue in kernel space, which is still empty.
  4. Thread A goes to sleep on the wait queue in kernel space.

Thread A has missed the wake-up from step 3, so unless another FUTEX_WAKE arrives later it will sleep forever. GG.
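One possible way to close this window is to do the load, the comparison, and the enqueue while holding the same lock that the wake path takes. The sketch below is a user-space model of that idea, not rCore's actual code: `FutexQueue`, `Tickets`, and `FutexError` are hypothetical names, and `std::sync::Mutex` plus `Condvar` stand in for the kernel's queue lock and scheduler sleep. `Condvar::wait` releases the lock and blocks in a single step, so a waker cannot slip in between the check and the sleep.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Condvar, Mutex};

/// Hypothetical ticket counters for one futex word, guarded by the queue lock.
#[derive(Default)]
struct Tickets {
    next: u64,   // ticket handed to the next waiter
    served: u64, // every ticket below this value has been woken
}

/// Hypothetical stand-in for the kernel wait queue of one futex word.
#[derive(Default)]
pub struct FutexQueue {
    tickets: Mutex<Tickets>,
    cond: Condvar,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FutexError {
    /// The futex word did not contain the expected value (EAGAIN).
    Again,
}

impl FutexQueue {
    /// FUTEX_WAIT: load, compare, and enqueue all happen under the queue lock.
    pub fn wait(&self, word: &AtomicU32, expected: u32) -> Result<(), FutexError> {
        let mut t = self.tickets.lock().unwrap();
        if word.load(Ordering::SeqCst) != expected {
            return Err(FutexError::Again);
        }
        // Enqueue while still holding the lock, so wake() cannot run in between.
        let my_ticket = t.next;
        t.next += 1;
        // Condvar::wait unlocks and sleeps atomically; the loop absorbs
        // spurious wake-ups.
        while t.served <= my_ticket {
            t = self.cond.wait(t).unwrap();
        }
        Ok(())
    }

    /// FUTEX_WAKE: wake at most `n` waiters and return how many were woken.
    pub fn wake(&self, n: u64) -> u64 {
        let mut t = self.tickets.lock().unwrap();
        let woken = n.min(t.next - t.served);
        t.served += woken;
        drop(t);
        self.cond.notify_all();
        woken
    }
}
```

With this ordering, step 3 of the scenario above can only run either before step 1, in which case thread A sees the new value and gets `FutexError::Again`, or after step 4, in which case thread A is already queued and is woken. A real kernel fix would use the wait queue's own lock and the scheduler's sleep primitive instead of the std types assumed here.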

commented

The current Condvar facility is incomplete and prone to race conditions. We need to rethink how Condvar works and how sys_{poll, select} use it.