Simpler ParFuture

Question

basvandijk opened this issue 12 years ago

The current ParFuture type class is parameterized by both the monad m and the future type. Since the only operation that can be applied to a future is get, why not perform that operation inside spawn, so that the user doesn't need to do it and can't accidentally put to the future twice? This also eliminates the need for MultiParamTypeClasses and FunctionalDependencies for this type class:

class Monad m => Spawn m where
  -- | Create a potentially-parallel computation, and return a /future/
  -- (or /promise/) computation that can be used to query the result of the forked
  -- computation.  
  --
  -- >  spawn p = do
  -- >    r <- new
  -- >    fork (p >>= put r)
  -- >    return (get r)
  --
  spawn  :: NFData a => m a -> m (m a)

  -- | Like 'spawn', but the result is only head-strict, not fully-strict.
  spawn_ :: m a -> m (m a)

(Note I also use this interface in my threads package.)
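
To make the shape of the proposal concrete, here is a minimal sketch of an instance, assuming IO with MVar as a stand-in for a Par monad with IVar. Only the head-strict spawn_ is included so the example needs nothing beyond base; the demo function and its values are hypothetical, and a failing child would leave the waiter blocked, which this sketch does not handle.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar)

-- Reduced version of the proposed class: only the head-strict method,
-- so no NFData (deepseq) dependency is needed for this sketch.
class Monad m => Spawn m where
  spawn_ :: m a -> m (m a)

-- Assumption: IO and MVar stand in for a Par monad and its IVar.
instance Spawn IO where
  spawn_ p = do
    r <- newEmptyMVar                          -- plays the role of 'new'
    _ <- forkIO (p >>= \x -> putMVar r $! x)   -- 'fork (p >>= put r)', forcing x to WHNF
    return (readMVar r)                        -- plays the role of 'get r'

-- Hypothetical usage: start the work, do something else, then wait.
demo :: IO Int
demo = do
  wait <- spawn_ (return (sum [1 .. 100 :: Int]))
  let other = length "something else"          -- unrelated work done meanwhile
  x <- wait                                    -- blocks only here, if at all
  return (x + other)

main :: IO ()
main = demo >>= print
```

The caller never sees the MVar, so there is nothing it could write to a second time; the only thing it holds is the action that reads the result.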

Adam C. Foltzer · Answer 1 · Sun May 06 2012 05:23:27 GMT+0800 (China Standard Time)

Hi Bas,

Thanks for the suggestion! I am perhaps misunderstanding, however: since get will block on an empty future, it seems this definition would make spawn block until the child computation is finished, allowing for no parallelism. I hope I'm wrong: we'd love to make this a much simpler class!

Bas van Dijk · Answer 2 · Sun May 06 2012 06:48:38 GMT+0800 (China Standard Time)

> however: since get will block on an empty future, it seems this definition would make spawn block until the child computation is finished, allowing for no parallelism.

No, spawn returns a computation that, when executed, will block until the forked thread finishes. You would use it as:

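do wait <- spawn someExpensiveComputation  -- returns immediately
   doSomethingElse                         -- runs alongside the forked computation
   x <- wait                               -- blocks until the result is available
   return x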

Adam C. Foltzer · Answer 3 · Sun May 06 2012 06:52:54 GMT+0800 (China Standard Time)

Ah, I missed the implications of the changed type. This looks quite promising; my worry is whether we'll be able to make it work with the sparks-based scheduler. We'll have a look this week!

Simon Marlow · Answer 4 · Tue May 08 2012 16:10:23 GMT+0800 (China Standard Time)

One disadvantage of this approach is that IVar supports Eq (and could conceivably support Ord), whereas you can't do much with values of type (m a).
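
A small sketch of that difference, assuming MVar as a stand-in for IVar (both are mutable cells with an Eq instance based on identity). The function name sameFuture is hypothetical.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar)

-- Handle-style futures can be compared for identity.
sameFuture :: IO (Bool, Bool)
sameFuture = do
  a <- newEmptyMVar :: IO (MVar Int)
  b <- newEmptyMVar :: IO (MVar Int)
  return (a == a, a == b)   -- same cell, then two distinct cells

-- By contrast, a future represented as an action (IO Int here) has no
-- Eq instance, so comparing two such "wait" actions would not type-check.

main :: IO ()
main = sameFuture >>= print
```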

Ryan Newton · Answer 5 · Mon May 14 2012 23:57:39 GMT+0800 (China Standard Time)

This is neat.

We have been thinking of ParFuture as a restricted version of ParIVar, where we spend most of our time. That said, futures by themselves are obviously an important programming model.

I don't see that the above proposal would break anything regarding Par implementations inhabiting both ParFuture, ParIVar, etc. But, I do worry about whether there would be any overheads to constructing extra (m a) objects, or simply confounding the optimizer. We especially want to retain good performance of ParFuture under Control.Monad.Par.Scheds.Sparks.
It would be easy enough to test simply by writing the above spawn as a wrapper around the current version. I don't think this is any more inefficient than what we would do.
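
The wrapper experiment could look roughly like the sketch below. ParFutureLike, spawnF_, getF and spawnWrapped_ are hypothetical names for a cut-down, two-parameter class in the style of the current one; the IO/MVar instance is an assumption made only so the example runs with base alone, and says nothing about the overheads in the sparks scheduler.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- Hypothetical stand-in for the current two-parameter class.
class Monad m => ParFutureLike future m | m -> future where
  spawnF_ :: m a -> m (future a)
  getF    :: future a -> m a

-- Assumption: IO and MVar stand in for a Par monad and its future type.
instance ParFutureLike MVar IO where
  spawnF_ p = do
    r <- newEmptyMVar
    _ <- forkIO (p >>= putMVar r)
    return r
  getF = readMVar

-- The proposed interface written as a wrapper around the existing one:
-- spawn as before, then hand back the 'get' action instead of the future.
spawnWrapped_ :: ParFutureLike future m => m a -> m (m a)
spawnWrapped_ p = fmap getF (spawnF_ p)

demo :: IO Int
demo = do
  wait <- spawnWrapped_ (return (6 * 7))
  wait

main :: IO ()
main = demo >>= print
```

The only extra work per spawn is the partial application of getF to the future, which is the (m a) object whose cost would need measuring.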

An alternative that we briefly considered was to instead use an associated type with the Par* type classes to get rid of the functional dependency. We mainly backed down from that for an incidental reason -- it would break newtype deriving, which we use heavily in meta-par. Still, it would provide a cleaner interface from the user's perspective.
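
For comparison, the associated-type variant might be sketched as follows. ParFutureAT, Future, spawnAT_ and getAT are hypothetical names, and the IO/MVar instance is again only an assumption to keep the example self-contained.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
import Data.Kind (Type)

-- Single-parameter class: the future type is chosen by the monad
-- through an associated type rather than a functional dependency.
class Monad m => ParFutureAT m where
  type Future m :: Type -> Type
  spawnAT_ :: m a -> m (Future m a)
  getAT    :: Future m a -> m a

-- Assumption: IO and MVar stand in for a Par monad and its future type.
instance ParFutureAT IO where
  type Future IO = MVar
  spawnAT_ p = do
    r <- newEmptyMVar
    _ <- forkIO (p >>= putMVar r)
    return r
  getAT = readMVar

demo :: IO String
demo = do
  f <- spawnAT_ (return "done")
  getAT f

main :: IO ()
main = demo >>= putStrLn
```

Unlike the (m a) encoding, the future stays a first-class value here, so instances such as Eq on the underlying type remain usable.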

P.S. By the multiple-put error, do you mean the case where the 'm' is also in class ParIVar? In the futures-only case, using ParFuture alone, there should be no opportunity for a multiple put.