nix-rust / nix

Rust friendly bindings to *nix APIs

Debug the `test_signal::test_sigsuspend()` test failure in macOS CI

SteveLauC opened this issue · comments

CI log: https://github.com/nix-rust/nix/actions/runs/8306056722/job/22733675051

failures:

---- sys::test_signal::test_sigsuspend stdout ----
thread '<unnamed>' panicked at 'assertion failed: SIGNAL_RECIEVED.load(Ordering::SeqCst)', test/sys/test_signal.rs:386:9
thread 'sys::test_signal::test_sigsuspend' panicked at 'called `Result::unwrap()` on an `Err` value: Any { .. }', test/sys/test_signal.rs:393:6


failures:
    sys::test_signal::test_sigsuspend

I can probably do this when I receive my macOS machine.

I cannot reproduce this on my M1 Mac:

$ uname -a
Darwin Steves-MacBook-Air.local 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul  5 22:22:52 PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T8103 arm64

I have no idea why that bool value has not been set to true by the signal handler by the time sigsuspend() returns. On Linux, the manual explicitly states that when sigsuspend() returns, the signal handler is guaranteed to have already run:

If the signal is caught, then sigsuspend() returns after the signal handler returns

The macOS manual does not mention this.

It also failed here: https://github.com/nix-rust/nix/actions/runs/8700433252/job/23860623409?pr=2374 on #2374

I think this test is OK, but sigprocmask races with ALL tests that use thread signal masks, because on macOS sigprocmask sets the mask of all threads.

If we don't want to serialize everything, we could add an RwLock (writer: sigprocmask, readers: all the tests that use thread masks).
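
A minimal sketch of that RwLock idea, using only the standard library. The names `SIGMASK_LOCK`, `process_mask_guard` and `thread_mask_guard` are hypothetical and are not taken from the nix test suite. The assumption is that every test touching signal masks would take one of these guards first:

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

// Hypothetical lock shared by all signal-mask tests (name is an assumption).
static SIGMASK_LOCK: RwLock<()> = RwLock::new(());

/// Exclusive guard for tests that call sigprocmask, which may affect
/// every thread in the process.
fn process_mask_guard() -> RwLockWriteGuard<'static, ()> {
    // A panicking test poisons the lock; recover the guard so that
    // later tests are not failed by an unrelated panic.
    SIGMASK_LOCK
        .write()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Shared guard for tests that only change their own thread's mask.
/// Any number of these may run concurrently, but never alongside a writer.
fn thread_mask_guard() -> RwLockReadGuard<'static, ()> {
    SIGMASK_LOCK
        .read()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

A test such as test_sigsuspend would then start with `let _guard = thread_mask_guard();`, and a test that calls sigprocmask would start with `let _guard = process_mask_guard();`. This only helps if the race described above is the actual cause.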

EDIT: But this should trigger the assert!(!SIGNAL_RECIEVED...) on line 384, not the one on line 386.

EDIT: A wild guess based on https://github.com/apple-oss-distributions/xnu/blob/94d3b452840153a99b38a3a9659680b2a006908e/bsd/kern/kern_sig.c#L571 (I did not validate any of the if conditions and have no idea how the kernel works):
sigprocmask calls something like block_procsigmask, which calls proc_signalend, which calls wakeup, waking up the sigsuspend before the handler has executed?

Maybe #2375 fixes it, but without a reliable way to reproduce the failure, it's hard to tell.