taskSupport does not propagate to result collections
mernst-github opened this issue
Operations on parallel collections do not propagate the taskSupport to the result.
This seems highly unintuitive.
For example, a fellow engineer recently changed code from
par.map(expensiveProcessing)
to
par.filter(isRelevant).map(expensiveProcessing)
losing the taskSupport along the way, so that expensiveProcessing ended up being executed on the global ExecutionContext, which was not intended. Three different engineers (the author, the code reviewer, and me) did not expect this.
Is this on purpose or a bug?
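For anyone hitting the same thing, a minimal sketch of a workaround is to re-assign the task support on the intermediate collection before the expensive step. It assumes Scala 2.13 with the scala-parallel-collections module on the classpath. The bodies of isRelevant and expensiveProcessing are hypothetical stand-ins, and the object name is made up for illustration. Note that the member is spelled `tasksupport` in the API.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.CollectionConverters._ // provides .par on Scala 2.13+
import scala.collection.parallel.ForkJoinTaskSupport

object TaskSupportWorkaround {
  // Hypothetical stand-ins for the functions named in the issue.
  def isRelevant(n: Int): Boolean = n % 2 == 0
  def expensiveProcessing(n: Int): Int = n * n

  def run(pool: ForkJoinPool): Vector[Int] = {
    val custom = new ForkJoinTaskSupport(pool)
    val par = (1 to 10).toVector.par
    par.tasksupport = custom

    val filtered = par.filter(isRelevant)
    // On affected versions `filtered` may carry the default task support,
    // so assign the custom one again before the expensive step.
    filtered.tasksupport = custom

    filtered.map(expensiveProcessing).seq
  }
}
```

On a version where the result already inherits the task support, the extra assignment should be redundant but harmless.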
Some research in the library code suggests that there is an intention to set a non-default taskSupport on the result:
- Combiner has "resultWithTaskSupport" and propagates its combinerTaskSupport there.
- ParIterableLike#combinerFactory propagates the taskSupport to combinerTaskSupport, but only if combiner.canBeShared (https://github.com/scala/scala-parallel-collections/blob/master/core/src/main/scala/scala/collection/parallel/ParIterableLike.scala#L545)
Not sure whether that makes sense or not, but my intuition would be to propagate in either case.
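To make the difference concrete, here is a toy model of the two behaviours. It is not the library's code: every type and method name below (ToyTaskSupport, ToyCombiner, ToyFactory) is hypothetical and only mirrors the description above, namely that the task support reaches the combiner only when canBeShared is true.

```scala
// Toy model only: hypothetical stand-ins, not the library's types.
final class ToyTaskSupport(val name: String)

object ToyTaskSupport {
  val Default = new ToyTaskSupport("default")
}

final class ToyCombiner(val canBeShared: Boolean) {
  // Stays at the default until someone assigns it.
  var combinerTaskSupport: ToyTaskSupport = ToyTaskSupport.Default
}

object ToyFactory {
  // Behaviour as described in the issue: propagate only for shareable combiners.
  def conditional(ts: ToyTaskSupport, shareable: Boolean): ToyCombiner = {
    val c = new ToyCombiner(shareable)
    if (c.canBeShared) c.combinerTaskSupport = ts
    c
  }

  // Behaviour suggested in the issue: propagate in either case.
  def always(ts: ToyTaskSupport, shareable: Boolean): ToyCombiner = {
    val c = new ToyCombiner(shareable)
    c.combinerTaskSupport = ts
    c
  }
}
```

In this model a non-shareable combiner built by `conditional` keeps the default, which is how a result collection could end up without the caller's task support.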
I agree this looks like an oversight, but I barely know the codebase, nor am I a concurrency expert.
@axel22, do you have a moment to take a look?
I agree, propagating the TaskSupport looks like the way to go. I don't know what the intent behind canBeShared == false is.
Thanks, @axel22
@mernst-github do you have time to prepare PRs to scala/scala (2.12.x) and this repo?
Happy to, just let me check the contrib formalities w/ my employer.
I have only made limited progress on this front; it might be easier for someone else to add the two assignments...
No prob, please take a look at #161
1.0.1 is en route to Maven Central