rayon-rs / rayon

Rayon: A data parallelism library for Rust

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

`ParallelExtend` for tuples of references

xosmig opened this issue · comments

ParallelExtend implemented for tuples allows extending multiple collections from one ParallelIterator, which is really useful. This code works as expected:

let vec1 = vec![1., 2., 3.];
let vec2 = vec!["1".to_string(), "2".to_string(), "3".to_string()];

let input = vec![4, 5, 6];

let mut vecs = (vec1, vec2);
vecs.par_extend(input.into_par_iter().map(|i| (i as f64, i.to_string())));

assert_eq!(&vecs.0, &vec![1., 2., 3., 4., 5., 6.]);
assert_eq!(&vecs.1, &vec!["1", "2", "3", "4", "5", "6"]);

However, the way it is implemented now, the two collections actually have to be stored as a tuple. This code doesn't compile:

let mut vec1 = vec![1., 2., 3.];
let mut vec2 = vec!["1".to_string(), "2".to_string(), "3".to_string()];

let input = vec![4, 5, 6];

// Error: method not found in `(&mut Vec<{float}>, &mut Vec<String>)`
(&mut vec1, &mut vec2).par_extend(input.into_par_iter().map(|i| (i as f64, i.to_string())));

assert_eq!(&vec1, &vec![1., 2., 3., 4., 5., 6.]);
assert_eq!(&vec2, &vec!["1", "2", "3", "4", "5", "6"]);

This looks like a natural use-case and I encountered it at least a couple of times while using rayon. Am I missing some other tools to implement what I want? If not, would it be possible to add support for it? Perhaps, with a slightly different API as implementing ParallelExtend for (&mut FromA, &mut FromB) would probably conflict with the original implementation for (FromA, FromB).

implementing ParallelExtend for (&mut FromA, &mut FromB) would probably conflict with the original implementation for (FromA, FromB).

Yes, since &mut _ can be a FromA itself, and even implemented downstream since that's a fundamental type. It might have been nice to add a blanket impl for &mut T where T: ParallelExtend, but it's a breaking change to add that now. We possibly could add that for each upstream type we have implemented, like &mut Vec<T>, and that would make your example work.

(I feel like the blanket impl must have been discussed somewhere, but I can't find it...)

For now, as a workaround, I'm using a simple wrapper like this:

/// Adds method `by_ref()` to `ParallelExtend` that returns a wrapper around `&mut A`
/// that "inherits" the implementation of `ParallelExtend` from `A`.
pub trait ParExtendByRefTrait<T: Send>: ParallelExtend<T> {
    fn by_ref(&mut self) -> ExtendRef<'_, T, Self> {
        ExtendRef::new(self)
    }
}

impl<T: Send, A: ParallelExtend<T> + ?Sized> ParExtendByRefTrait<T> for A {}

/// A wrapper around `&mut A` that "inherits" the implementation of `ParallelExtend` from `A`.
///
/// The type parameter `T` is necessary for Rust to infer which implementation of `by_ref()` to use. 
pub struct ExtendRef<'a, T, A: ?Sized> {
    inner: &'a mut A,
    _phantom: PhantomData<T>,
}

impl<'a, T, A: ?Sized> ExtendRef<'a, T, A> {
    pub fn new(inner: &'a mut A) -> Self {
        Self {
            inner,
            _phantom: PhantomData,
        }
    }
}

impl<'a, T: Send, A: ParallelExtend<T> + ?Sized> ParallelExtend<T> for ExtendRef<'a, T, A> {
    fn par_extend<I>(&mut self, par_iter: I)
    where
        I: IntoParallelIterator<Item = T>,
    {
        self.inner.par_extend(par_iter)
    }
}

That can later used like this:

(vec1.by_ref(), vec2.by_ref()).par_extend(...)

Do you think it could make sense to add a similar by_ref() method to ParallelExtend?

I think by_ref() is too generic to put in our prelude, where everything else has some par in it, and this should also indicate that it's only for "extend" somehow. That loses some convenience of brevity though, I know.

Maybe the wrapper should just be a tuple struct, ExtendRef<'a, A>(pub &'a mut A)? Then you can also use it as a direct constructor, like (ExtendRef(&mut vec1), ExtendRef(&mut vec2)).par_extend(...).

Or if this is really only useful for tuples, then something like ExtendTuple(&mut vec1, &mut vec2).