hedgehogqa / haskell-hedgehog

Release with confidence, state-of-the-art property testing for Haskell.

How are `StateT s Gen a` and `GenT (State s) a` supposed to be used?

ChickenProp opened this issue

I have a test pattern that looks like: when I generate objects, put them in state; then when I generate other objects that reference them, I can look them up. So if a File references a Folder, and I want to create a File, I have two options:

  • Generate a Folder explicitly and pass it to the File generator
  • Look up the list of Folders in state and either pick one of them or generate a new one using some default generator

This makes it easy to say "give me three different Files, which may or may not be in the same Folder".
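A minimal sketch of that pattern, for concreteness. The types and names here (Folder, File, genFolder, genFileIn, genFile) are hypothetical and only illustrate the two options above; the sketch assumes the mtl Control.Monad.State module and hedgehog's MonadGen instance for StateT. It makes no claim about how such a generator shrinks, which is the question in this issue.

```haskell
import           Control.Monad.State (StateT)
import qualified Control.Monad.State as State
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- Hypothetical domain types, for illustration only.
newtype Folder = Folder Int
  deriving (Eq, Show)

data File = File { fileId :: Int, fileFolder :: Folder }
  deriving (Eq, Show)

-- Generate a Folder and record it in state so later generators can find it.
genFolder :: StateT [Folder] Gen Folder
genFolder = do
  folder <- Folder <$> Gen.int (Range.constant 0 100)
  State.modify (folder :)
  pure folder

-- Option 1: the caller passes the Folder explicitly.
genFileIn :: Folder -> StateT [Folder] Gen File
genFileIn folder = do
  n <- Gen.int (Range.constant 0 100)
  pure (File n folder)

-- Option 2: pick a Folder already in state, or generate a new one.
genFile :: StateT [Folder] Gen File
genFile = do
  known <- State.get
  folder <- case known of
    [] -> genFolder                                  -- nothing saved yet
    _  -> Gen.choice [Gen.element known, genFolder]  -- reuse or create
  genFileIn folder
```

With this, something like State.evalStateT (replicateM 3 genFile) [] would give three Files that may or may not share a Folder.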

But, it seems that shrinking doesn't work like I'd hoped. Here's a simple demonstration:

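{-# LANGUAGE TupleSections #-}

-- Imports assumed from the qualified names used in these examples.
import           Control.Monad.State (StateT, runStateT)
import qualified Control.Monad.State as State
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

numGen1 :: StateT [Int] Gen (Int, [Int])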
numGen1 = do
  a <- do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num
  b <- do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, [b, a]) <$> Gen.element numsState

numGen2 :: StateT [Int] Gen (Int, [Int])
numGen2 = do
  numsGen <- Gen.list (Range.constant 2 2) $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, numsGen) <$> Gen.element numsState

I'd hope these would be basically the same: generate two numbers, save them in state, then pick one of the numbers that was saved. We have runStateT numGen_ [] :: Gen ((Int, [Int]), [Int]), and for every value generated, the two lists should be equal and the single value should be contained in them.
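Spelled out as a plain predicate on one generated value, the expectation is roughly the following. The name consistent is hypothetical and only used for illustration; it needs nothing beyond the Prelude.

```haskell
-- Hypothetical helper: True when the returned list equals the saved
-- state and the picked element appears in that state.
consistent :: ((Int, [Int]), [Int]) -> Bool
consistent ((picked, generated), saved) =
  generated == saved && picked `elem` saved
```

Applied to values from the trees in this issue, consistent ((7,[6,7]),[6,7]) is True, while consistent ((0,[10,0]),[0]) is False because the two lists differ.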

This property holds for the immediately generated values, and for shrunk values from numGen1:

ghci> Gen.printTree $ runStateT numGen1 []
((7,[6,7]),[6,7])
 ├╼((0,[6,0]),[6,0])
 │  ├╼((0,[0,0]),[0,0])
 │  │  └╼((0,[0,0]),[0,0])
 │  ├╼((0,[3,0]),[3,0])
 │  │  ├╼((0,[2,0]),[2,0])
...

But it fails for shrunk values from numGen2:

ghci> Gen.printTree $ runStateT numGen2 []
((10,[10,1]),[10,1])
 ├╼((0,[10,0]),[0])
 │  ├╼((0,[0,0]),[0,1])
 │  ├╼((5,[5,0]),[5,1])
 │  │  ├╼((3,[3,0]),[3,1])
 │  │  │  └╼((2,[2,0]),[2,1])
 │  │  │     └╼((1,[1,0]),[1,1])
...

where the two lists are different, though somehow the single value is still contained in both of them.

I think the culprit is that Gen.list does something complicated:

list :: MonadGen m => Range Int -> m a -> m [a]
list range gen =
  let
     interleave =
       (interleaveTreeT . nodeValue =<<)
  in
    sized $ \size ->
      ensure (atLeast $ Range.lowerBound size range) .
      withGenT (mapGenT (TreeT . interleave . runTreeT)) $ do
        n <- integral_ range
        replicateM n (toTreeMaybeT gen)

interleaveTreeT :: Monad m => [TreeT m a] -> m (NodeT m [a])
interleaveTreeT =
  fmap Tree.interleave . traverse runTreeT

It's not clear to me why it does this, but the term "interleave" makes me think it's about rearranging the shrink tree, where the default behavior would only shrink one element at a time?

(If we replace Gen.list (Range.constant 2 2) with replicateM 2, then numGen2 behaves like numGen1.)
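For reference, a sketch of that substitution written out in full. The name numGen3 is hypothetical; the body is numGen2 with only the Gen.list call swapped for replicateM 2, and the imports assume mtl's Control.Monad.State.

```haskell
{-# LANGUAGE TupleSections #-}

import           Control.Monad (replicateM)
import           Control.Monad.State (StateT)
import qualified Control.Monad.State as State
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- numGen3 is a hypothetical name: numGen2 with Gen.list replaced by
-- replicateM, so the list combinator's interleaving is not involved.
numGen3 :: StateT [Int] Gen (Int, [Int])
numGen3 = do
  numsGen <- replicateM 2 $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)   -- save each generated number in state
    pure num

  numsState <- State.get
  (, numsGen) <$> Gen.element numsState
```

Its shrink tree can be inspected the same way as the others, with Gen.printTree $ State.runStateT numGen3 [].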

So I guess I'm asking if this kind of thing is expected behavior for StateT s Gen a; and is there a way to do the kind of thing I'm trying to do without avoiding Gen.list entirely?

I've wondered about using GenT (State s) a instead, but I don't know how that would work. There's hoist to turn it into a Gen a, and hoist (Identity . flip evalState []) typechecks. Does it work? It's not obviously wrong: I can simply change the type of numGen2 to GenT (State [Int]) (Int, [Int]) and do

ghci> Gen.printTree $ hoist (Identity . flip evalState []) numGen2'
(4,[9,4])
 ├╼(0,[0,4])
 │  ├╼(0,[0,0])
 │  ├╼(2,[0,2])
 │  │  └╼(1,[0,1])
 │  └╼(3,[0,3])
...

...but I've lost access to the state variables here, so I can't tell what's going on with that, and passing in [] feels like I might be losing state somewhere? But I don't know. Similarly I could use forAllT, but then I'd need to turn a PropertyT (State s) into a PropertyT IO, which feels like it would have the same problem with [].

So, will that do what I want? I'm not really sure how to investigate further other than "try it and hope I don't run into errors that I don't understand".

Ah, no, it doesn't work. I can return numsState directly and I get

numGen2' :: GenT (State [Int]) (Int, [Int], [Int])
numGen2' = do
  numsGen <- Gen.list (Range.constant 2 2) $ do
    num <- Gen.int (Range.constant 0 10)
    State.modify (num :)
    pure num

  numsState <- State.get
  (, reverse numsGen, numsState) <$> Gen.element numsState

ghci> Gen.printTree $ hoist (Identity . flip State.evalState []) numGen2'
(9,[7,9],[7,9])
 ├╼(0,[7,0],[0])
 │  ├╼(0,[0,0],[0])
 │  ├╼(4,[4,0],[4])
 │  │  ├╼(2,[2,0],[2])
 │  │  │  └╼(1,[1,0],[1])
 │  │  └╼(3,[3,0],[3])
...

which looks interestingly different from the results from numGen2, but still not what I'm hoping for.