simonmar / monad-par

Simpler ParFuture

basvandijk opened this issue · comments

The current ParFuture type class is parameterized by both the monad m and the future type. Since the only operation that can be applied to a future is get, why not perform that operation inside spawn, so that the user doesn't need to do it and can't accidentally put the future twice? This also eliminates the need for MultiParamTypeClasses and FunctionalDependencies for this type class:

class Monad m => Spawn m where
  -- | Create a potentially-parallel computation, and return a /future/
  -- (or /promise/) computation that can be used to query the result of the forked
  -- computation.  
  --
  -- >  spawn p = do
  -- >    r <- new
  -- >    fork (p >>= put r)
  -- >    return (get r)
  --
  spawn  :: NFData a => m a -> m (m a)

  -- | Like 'spawn', but the result is only head-strict, not fully-strict.
  spawn_ :: m a -> m (m a)

(Note I also use this interface in my threads package.)

Hi Bas,

Thanks for the suggestion! I am perhaps misunderstanding, however: since get will block on an empty future, it seems this definition would make spawn block until the child computation is finished, allowing for no parallelism. I hope I'm wrong: we'd love to make this a much simpler class!

> however: since get will block on an empty future, it seems this definition would make spawn block until the child computation is finished, allowing for no parallelism.

No, spawn returns a computation that when executed will block until the forked thread finishes. You would use it as:

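do wait <- spawn someExpensiveComputation  -- returns immediately
   doSomethingElse                         -- runs alongside the forked computation
   x <- wait                               -- blocks until the result is available
   return x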
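
To make the non-blocking behaviour concrete, here is a minimal sketch of the proposed class with a toy instance for IO. The IO instance, the use of MVar and forkIO, and the name example are assumptions made for illustration; this is not how monad-par's schedulers are implemented, and the sketch does no exception handling.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar)
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)

-- The class as proposed at the top of this thread.
class Monad m => Spawn m where
  spawn  :: NFData a => m a -> m (m a)
  spawn_ :: m a -> m (m a)

-- Hypothetical instance for IO, backed by an MVar (illustration only).
instance Spawn IO where
  -- Fully evaluate the result in the forked thread.
  spawn p = spawn_ (p >>= evaluate . force)
  -- Evaluate the result to weak head normal form only.
  spawn_ p = do
    r <- newEmptyMVar
    _ <- forkIO (p >>= \x -> x `seq` putMVar r x)
    return (readMVar r)  -- the returned action is the only thing that blocks

-- spawn returns at once; only running 'wait' blocks on the result.
example :: IO (Integer, Integer)
example = do
  wait <- spawn (return (sum [1 .. 100000]))
  let y = product [1 .. 10]  -- other work between spawn and wait
  x <- wait
  return (x, y)
```

Because the MVar never escapes spawn_, user code has no handle on which to call put a second time.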

Ah, I missed the implications of the changed type. This looks quite promising; my worry is whether we'll be able to make it work with the sparks-based scheduler. We'll have a look this week!

One disadvantage of this approach is that IVar supports Eq (and could conceivably support Ord), whereas you can't do much with values of type (m a).
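
A small sketch of that difference, using MVar from base as a stand-in for IVar (an assumption; the name sameCell is also made up for this example). The cell itself can be compared for identity, while the computation returned by the proposed spawn cannot.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar)

-- MVar stands in for IVar here; both are mutable cells with an Eq instance.
sameCell :: IO (Bool, Bool)
sameCell = do
  v1 <- newEmptyMVar :: IO (MVar Int)
  v2 <- newEmptyMVar
  -- Equality on cells is identity of the cell, not of its contents.
  return (v1 == v1, v1 == v2)  -- (True, False)

-- By contrast, there is no Eq instance for actions of type IO a, so an
-- expression such as
--   readMVar v1 == readMVar v1
-- is rejected by the type checker.
```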

This is neat.

We have been thinking of ParFuture as a restricted version of ParIVar, where we spend most of our time. That said, futures by themselves are obviously an important programming model.

I don't see that the above proposal would break anything regarding Par implementations inhabiting ParFuture, ParIVar, etc. But I do worry about whether there would be any overhead from constructing extra (m a) objects, or from simply confounding the optimizer. We especially want to retain good performance of ParFuture under Control.Monad.Par.Scheds.Sparks.
It would be easy enough to test simply by writing the above spawn as a wrapper around the current version. I don't think this is any more inefficient than what we would do.
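
A minimal sketch of such a wrapper follows. The two-parameter class below is a local stand-in with the shape described in this thread (only spawn_ and get are reproduced), the MVar-backed IO instance exists only so the example runs, and the name spawnWait_ is hypothetical. Against the real library, the wrapper body would be the same two lines.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- Local stand-in for the current two-parameter class (assumed shape).
class Monad m => ParFuture future m | m -> future where
  spawn_ :: m a -> m (future a)
  get    :: future a -> m a

-- Toy instance so the sketch is runnable; not a real Par scheduler.
instance ParFuture MVar IO where
  spawn_ p = do
    r <- newEmptyMVar
    _ <- forkIO (p >>= \x -> x `seq` putMVar r x)
    return r
  get = readMVar

-- The proposed interface, written as a wrapper over the current one.
-- The only extra cost is building the closure 'get f'.
spawnWait_ :: ParFuture future m => m a -> m (m a)
spawnWait_ p = do
  f <- spawn_ p
  return (get f)
```

Benchmarking spawnWait_ against a direct spawn_ followed by get would show whether the extra (m a) value costs anything measurable under a given scheduler.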

An alternative that we briefly considered was to use an associated type with the Par* type classes to get rid of the functional dependency. We mainly backed down from that for an incidental reason -- it would break newtype deriving, which we use heavily in meta-par. Still, it would provide a cleaner interface from the user's perspective.
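
For comparison, the associated-type variant might look roughly like the sketch below. The name Future for the associated type and the MVar-backed IO instance are assumptions for illustration; only spawn_ and get are shown.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
import Data.Kind (Type)

-- Single-parameter class: the future type is determined by the monad
-- through an associated type instead of a functional dependency.
class Monad m => ParFuture m where
  type Future m :: Type -> Type
  spawn_ :: m a -> m (Future m a)
  get    :: Future m a -> m a

-- Toy instance so the sketch is runnable; not a real Par scheduler.
instance ParFuture IO where
  type Future IO = MVar
  spawn_ p = do
    r <- newEmptyMVar
    _ <- forkIO (p >>= \x -> x `seq` putMVar r x)
    return r
  get = readMVar
```

User code then mentions only the monad in its constraints, for example `ParFuture m => m a -> m (Future m a)`, at the price of requiring TypeFamilies instead of FunctionalDependencies.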

P.S. By "multiple put error", do you mean the case where the 'm' is also in class ParIVar? For the futures-only case, using ParFuture alone, there should be no multiple-put opportunity.