rust-ndarray / ndarray

ndarray: an N-dimensional array with array views, multidimensional slicing, and efficient operations

Home Page:https://docs.rs/ndarray/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Stack Overflows on Zip indexed par_for_each

marstaik opened this issue · comments

Hi, I am running a bunch of these various various tasks in a rayon threadpool. At some point a stack overflow in a thread becomes inevitable.

pub fn make_heightmap_test<T: Density + Send>(
    grid: &mut ArrayViewMut3<T>,
) {
    Zip::indexed(grid).par_for_each(|coords, value| {
        // stuff ...
    });
}

The last readable position I can get is deep in rayon, but in ndarray code

zip/mod.rs

    /// Return an *approximation* to the max stride axis; if
    /// component arrays disagree, there may be no choice better than the
    /// others.
    fn max_stride_axis(&self) -> Axis {
        let i = if self.prefer_f() {
            self
                .dimension
                .slice()
                .iter()
                .rposition(|&len| len > 1)
                .unwrap_or(self.dimension.ndim() - 1)
        } else {
            /* corder or default */
            self
                .dimension
                .slice()
                .iter()
                .position(|&len| len > 1)
                .unwrap_or(0)
        };
        Axis(i)
    }
}

Attached is a call stack.
I can reproduce this quite consistently, but I am unsure of how to provide a dump in windows for rust.

call_stack.txt

Please let me know if I can provide more data.

Work stealing can generally lead to large stack depths. Is this for release builds (which use significantly less stack due to inlining and better space reuse)? Did you try to just increase the stack size of the worker threads using ThreadPoolBuilder::stack_size? This is necessary often enough just due to how Rayon's scheduler works.

This was indeed on a release build. Setting the stack size to 24 megabytes for fun seems to have fixed it. Is there documentation on what the default stack size is? Or is it machine dependent? I haven't been able to find it just searching around.

AFAIK it is OS-dependant and I think on contemporary Linux, it is 8 MB for the main thread, c.f. https://unix.stackexchange.com/questions/127602/default-stack-size-for-pthreads

The main issue is that for the threads making up Rayon's thread pool, it is the much smaller default of 2 MB (which is controlled by Rust's std::thread module, c.f. https://doc.rust-lang.org/stable/std/thread/index.html#stack-size, i.e. it is much easier to hit this with Rayon than without it (both due to work stealing increasing usage and a smaller limit to start with).