rayon-rs / rayon

Rayon: A data parallelism library for Rust

Support for some sort of collect()/extend() into `LinkedList<Vec<T>>`

coolreader18 opened this issue

From what I can tell, building a `LinkedList<Vec<T>>` is an efficient (one of the more efficient?) way of collecting a parallel iterator, which is why `ParallelExtend` for `Vec` uses it. However, if you don't specifically need a `Vec`, the process of concatenating all the vecs might be unnecessary overhead. It'd be nice if `ListVecConsumer` was exposed in some way, so that users could collect like that. (Yes, you could just `.fold(Vec::new, |mut vec, item| { vec.push(item); vec })`, but that's an element-wise push that skips the specialization of `Extend` for `Vec`.)
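To make the trade-off concrete, here is a minimal std-only sketch of the shape being discussed. It is not rayon's implementation: `collect_chunks` and `flatten` are hypothetical helpers, and fixed-size chunks on scoped threads stand in for rayon's work splitting. The first function stops at the list of per-chunk vecs; the second is the extra concatenation pass that a plain `Vec` collect would still have to do.

```rust
use std::collections::LinkedList;
use std::thread;

/// Hypothetical helper: builds one Vec per chunk on its own thread and links
/// the chunks together in order, without ever merging them.
fn collect_chunks<T: Clone + Send + Sync>(items: &[T], chunk_size: usize) -> LinkedList<Vec<T>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    thread::scope(|scope| {
        // Spawn one worker per chunk; each fills its own Vec independently.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.to_vec()))
            .collect();
        // Join in spawn order so the list preserves the original ordering.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

/// The extra step a plain Vec collect has to perform: one more pass that
/// moves every chunk into a single contiguous allocation.
fn flatten<T>(list: LinkedList<Vec<T>>) -> Vec<T> {
    let total: usize = list.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);
    for mut chunk in list {
        out.append(&mut chunk);
    }
    out
}
```

A caller that only needs to iterate over the results can stop after `collect_chunks` and skip `flatten` entirely, which is the overhead this issue is about.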

Directly implementing `FromParallelIterator<T>` for `LinkedList<Vec<T>>` might break type inference, but there could be a `collect_vec_list()` method on `ParallelIterator`, maybe.

> From what I can tell, building a `LinkedList<Vec<T>>` is an efficient (one of the more efficient?) way of collecting a parallel iterator, which is why `ParallelExtend` for `Vec` uses it.

It's the most efficient way that we know of, at least for unindexed iterators. For `IndexedParallelIterator`, we have a bit of pseudo-specialization (via `opt_len`) that lets it collect directly into the final pre-allocated `Vec`. We could still do that in your scenario as well, at the slightly wasteful cost of allocating a single `LinkedList` node to hold the entire `Vec`.
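As a rough illustration of that indexed case, the sequential sketch below uses `ExactSizeIterator` as a stand-in for a known length. The name `collect_known_len` is hypothetical and this is not how rayon implements it; it only shows the resulting shape, a list with exactly one node holding the whole pre-allocated `Vec`.

```rust
use std::collections::LinkedList;

/// Hypothetical stand-in for the indexed fast path: the exact length is known
/// up front, so the items land in one pre-allocated Vec with no merging.
fn collect_known_len<I>(iter: I) -> LinkedList<Vec<I::Item>>
where
    I: ExactSizeIterator,
{
    let mut vec = Vec::with_capacity(iter.len());
    vec.extend(iter);
    // The only overhead over returning `vec` itself is this one list node.
    let mut list = LinkedList::new();
    list.push_back(vec);
    list
}
```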

> there could be a `collect_vec_list()` method on `ParallelIterator`, maybe.

I think that sounds reasonable!

I don't think we need an extend flavor though, because anyone can cheaply append a collection to an existing list.
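The cheap append mentioned here is what `std::collections::LinkedList::append` provides: it relinks the nodes in constant time and copies no elements. A minimal sketch, where `extend_list` is a hypothetical wrapper name and `collected` stands for whatever list a collect step produced:

```rust
use std::collections::LinkedList;

/// Hypothetical "extend" built from a collected list: `LinkedList::append`
/// relinks nodes in O(1) and leaves `collected` empty, copying no elements.
fn extend_list<T>(existing: &mut LinkedList<Vec<T>>, mut collected: LinkedList<Vec<T>>) {
    existing.append(&mut collected);
}
```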