rayon-rs / rayon

Rayon: A data parallelism library for Rust

Yet more confusion about panic_handlers

cmyr opened this issue

I am using rayon's threadpool to execute a series of tasks. In the case that a task panics, I would like to set some global flag, so that I can stop adding new work and terminate. I would assume I could do this with a custom panic handler, but I've been poking at things for several hours without success.

A simplified version of my program looks like this:

use rayon::ThreadPoolBuilder;

fn main() {
    let threads = ThreadPoolBuilder::new()
        .thread_name(|n| format!("pool-{n}"))
        .panic_handler(|_oh_no| {
            eprintln!(
                "handled panic in thread {}",
                std::thread::current().name().unwrap_or("wat")
            );
        })
        .start_handler(|n| eprintln!("starting thread {n}"))
        .exit_handler(|n| eprintln!("exiting thread {n}"))
        .num_threads(4)
        .build()
        .expect("failed to build threadpool");

    threads.scope(|scope| {
        for i in 0..30 {
            scope.spawn(move |_| {
                eprintln!(
                    "{i} sleeping in thread {}",
                    std::thread::current().name().unwrap_or("")
                );
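                // `thread::sleep_ms` is deprecated; `thread::sleep` takes a `Duration`.
                std::thread::sleep(std::time::Duration::from_millis(i * 50));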
                eprintln!("{i} woke up");
                if i == 22 {
                    panic!("oh no");
                }
            })
        }
    })
}

Should I expect this to work? Do I need to just use catch_unwind directly, inside each call I make to spawn?

The pool's panic_handler is only used in places where we cannot otherwise propagate a panic back to the user, particularly from unscoped spawns. With a scope, we do propagate panics where the scope would otherwise return, but only after all its spawns have executed.

You might like panic_fuse -- e.g. your example code would be (0..30).into_par_iter().panic_fuse().for_each(|i| ...). Under the hood, that is just using an atomic flag to skip further work after a panic. That behavior is definitely reproducible on your own if panic_fuse doesn't fit the way you're executing.

Also, the usual caveats around catch_unwind remain -- none of this applies if the build is using panic=abort, since a panic then aborts the whole process instead of unwinding.

Okay, thanks! I ended up implementing something that looks like your explanation of panic_fuse, where I check an atomic bool from within the spawned task and return if it's been set.
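The approach described above might look roughly like the following sketch. It is not the poster's actual code: to stay dependency-free it uses `std::thread::scope` as a stand-in for the Rayon pool's `scope`, and the name `run_tasks` and the counters are hypothetical. Each task checks a shared `AtomicBool` before doing any work, and wraps its body in `catch_unwind` so that a panic sets the flag.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: spawns `total` tasks, one of which (`panic_at`) panics.
/// Returns (panic seen?, tasks completed, tasks skipped because of the flag).
fn run_tasks(total: u64, panic_at: u64) -> (bool, usize, usize) {
    let failed = AtomicBool::new(false);
    let completed = AtomicUsize::new(0);
    let skipped = AtomicUsize::new(0);

    thread::scope(|scope| {
        for i in 0..total {
            // Shadow with references so the `move` closure only copies borrows.
            let (failed, completed, skipped) = (&failed, &completed, &skipped);
            scope.spawn(move || {
                // Bail out early if some other task has already panicked.
                if failed.load(Ordering::Relaxed) {
                    skipped.fetch_add(1, Ordering::Relaxed);
                    return;
                }
                // Catch the panic inside the task so it can set the flag.
                let outcome = catch_unwind(AssertUnwindSafe(|| {
                    if i == panic_at {
                        panic!("oh no");
                    }
                }));
                match outcome {
                    Ok(()) => {
                        completed.fetch_add(1, Ordering::Relaxed);
                    }
                    Err(_) => failed.store(true, Ordering::Relaxed),
                }
            });
        }
    });

    (
        failed.load(Ordering::Relaxed),
        completed.load(Ordering::Relaxed),
        skipped.load(Ordering::Relaxed),
    )
}
```

With `run_tasks(30, 22)` the flag ends up set, and the other 29 tasks are split between completed and skipped in a way that depends on scheduling. The code that submits work can read the same flag to stop adding new tasks. As noted earlier in the thread, this relies on unwinding, so it does nothing under panic=abort.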