simonmar / monad-par

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

fork is too strict

simonmar opened this issue · comments

This simple program does not run in parallel:

import Control.Monad.Par

fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

main =
  args <- getArgs
  let [n,m] = map read args
  print $ runPar $ do
    i <- new
    j <- new
    fork (put i (fib n))
    fork (put j (fib m))
    a <- get i
    b <- get j
    return (a+b)

It isn't the old problem of fib not allocating: note that I used the Integer version that allocates.

No, the problem is that in the direct scheduler, fork is too strict: the work item is evaluated when put into the work pool (by pushL, which causes the fib call to be evaluated in the parent. It doesn't happen with spawnP, but I think that's only because spawnP isn't inlined enough to expose the strictness.

I've no problem with put being strict, but I thing fork being strict will cause confusion.

Maybe strictly speaking the inputs should be passed via IVars. That's defensible, but I think it would be nice if we didn't have to do that all the time. Making simple examples like the above work properly will avoid confusing users (it confused me, and I supposedly know what I'm doing :-).

I think there are basically two problems:

As you noted, the deque we use is strict, which forces IO actions to WHNF. That's a bit peculiar as it is, and I don't like it. Furthermore, said deque is intended only as a reference implementation. It's based on Data.Sequence, which is definitely overkill for this job: we don't need to be able to index into the deque or split it. Indeed, we only need an input-restricted deque: we never use pushR.

The other problem is that put and newFull perform the deepseq on the wrong side. It would seem more reasonable to use something more like this:

-- A version of put_ that doesn't force at all
put__ :: ...

-- This gets eta-expanded automatically
put_ iv !a = put__ iv a

-- We need to help GHC a bit with this one
put iv a = liftIO (a `deepseq` return ()) >> put__ iv a