xysun / blog

Mainly my paper reading notes

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Eval.map, stack safety, Trampoline

xysun opened this issue Β· comments

This note is about my understanding for the Eval type in cats, specifically, the fact that its Map and flatMap are so called "stack free", which is driven by a data structure called Trampoline.


Start with classical factorial example:

def factorial(n:Int):Int =
    if (n == 1) 1
    else n * factorial(n-1)

Code above is not stack safe. To calculate factorial(5), at one moment there will be 5 functions on the stack, each waiting for the next to complete.

factorial(5) = 5 * factorial(4)
factorial(4) = 4 * factorial(3)
factorial(3) = 4 * factorial(2)
factorial(2) = 4 * factorial(1)
factorial(1) = 1

You can easily blow up stack by calling factorial(10000) -- stack will have to hold 10000 functions at the same time.

We know we can easily fix this by turning it into a tail recursive function:

def factorial(n:Int, acc:Int):Int =
    if (n == 1) acc
    else factorial(n-1, acc*n)

With this function, at any time, no matter how big the initial input is, there is only one function on stack.

when you call factorial(5, 1), this function is created and placed on stack first. However, it quickly gets replaced by factorial(4, 5), then it gets replaced again by factorial(3, 20), and so on.

Many recursive functions can be re-written in a tail recursive manner, but it's tedious!

Eval.map or Eval.flatMap is an abstraction for writing tail recursive, stack safe functions.

Use Eval.Now as base case, then replace any non tail-recursive operation by calling .map, then wrap in Eval.Defer

def factorial(n:Int):Int = // change return type to Eval[Int]
    if (n == 1) 1 // => Eval.Now(1)
    else n * factorial(n-1) // => Eval.Defer(factorial(n-1).map(_ * n))

To get the actual result, call .value on the returned Eval[Int], in cats it's implemented by a evaluate[A](v:Eval[A]) function. The real trick is this function is implemented tail recursively (so each evaluate() call gets replaced by a new evaluate() call), so you get tail recursion for free, without thinking how to turn it to tail recursion.

This is not something special about Eval though. Behind the curtain, Eval implements an internal data structure called Trampoline, which is defined as below:

sealed trait Trampoline[A]
case class Done[A](value:A) extends Trampoline[A]
case class More[A](thunk: () => A) extends Trampoline[A]
def evaluate[A](t:Trampoline[A]):A = t match {
	case Done(v) => v
	case More(t) => evaluate(t()) // tail recursive

Another nice thing is Eval is by itself a monad, meaning you can use it in for comprehensions:

for {
	x1 <- Eval.defer(f1)
	x2 <- Eval.defer(f2)
	...
} yield {
	doSomething(x1, x2, ...)
}

Now this for comprehension is also stack safe: it will only keep one function on the stack, instead of all the fs.

References

commented

Here's a working example for lazy noobs like me:

import scala.math.BigInt
// import $ivy.`org.typelevel::cats-core:1.1.0`
import cats._, cats.data._, cats.implicits._

def factorial(n : BigInt) : Eval[BigInt] = 
    if (n == 1) Eval.now(1)
    else Eval.defer(factorial(n-1).map(_ * n)) 

factorial(10000).value

Hi Pavel! πŸ‘‹

commented

Hi Joy! πŸ‘‹πŸ˜ƒ

It's worth noting as the article you referenced mentions, trampolining unfortunately isn't as cost-free as the while loops that scalac can compile from plain tail recursion: it trades stack frames for heap space.

That obviously goes a long way and is easy to accept for many situations, but there has been a lot of buzz around the pure FP Scala world lately about the performance costs of the "rub some Free Monads on all the things!" craze where entire large programs and all their data flow may be done this way… :neckbeard:

By the way, I've just been going through the Scala with Cats book that Underscore made free, and it's pretty good and approachable.

πŸ‘‹

Hi Ches! πŸ‘‹
Well guess which book I'm reading that triggered this post :)

🐱 🐈 😸