Providing cats.Defer instances for Decoder and Encoder to make working with recursive structures easier

morgen-peschke opened this issue a year ago · comments

Certain shapes of data are difficult to write correct Decoder/Encoder instances for, and providing a Defer instance for Decoder and Encoder can make this much simpler to do correctly.

Problem Statement

The easiest way to see this is working through an example. I'll use the same data structure and sample JSON for each example:

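import cats.{Defer, Show}
import cats.syntax.all._
import io.circe.{Decoder, DecodingFailure, HCursor, Json}
import io.circe.Decoder.{AccumulatingResult, Result}
import io.circe.syntax._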

sealed trait Tree[A]
object Tree {
  final case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
  final case class Leaf[A](value: A) extends Tree[A]

  implicit def show[A]: Show[Tree[A]] = Show.fromToString

  val json: Json =
    Json.obj(
      "left" -> Json.obj(
        "left" -> Json.obj(
          "left" -> Json.obj("value" := 1),
          "right" -> Json.obj("value" := 2)
        ),
        "right" -> Json.obj("value" := 3)
      ),
      "right" -> Json.obj("value" := 4)
    )
}

Attempt 1: Write it by the book

A naive decoder, written using the examples from the ADT section of the guide, doesn't work as expected. Because the implicit definition has to be a def, "tying the knot" doesn't work: treeDecoder builds branchDecoder, which calls treeDecoder again, and so on. This produces a StackOverflowError at runtime:

implicit def branchDecoder[A: Decoder]: Decoder[Branch[A]] =
  Decoder.forProduct2[Branch[A], Tree[A], Tree[A]]("left", "right")(Branch.apply)

implicit def leafDecoder[A: Decoder]: Decoder[Leaf[A]] =
  Decoder[A].at("value").map(Leaf(_))

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] =
  List[Decoder[Tree[A]]](
    Decoder[Branch[A]].widen,
    Decoder[Leaf[A]].widen
  ).reduce(_ or _)

println {
  try Decoder[Tree[Int]].tryDecodeAccumulating(Tree.json.hcursor).show
  catch {
    case e: StackOverflowError => e
  }
}
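
For contrast, here is a minimal sketch of what "tying the knot" looks like when the type has no type parameter, so the instance can be an implicit lazy val. IntTree, its decoder, and KnotExample are hypothetical stand-ins that are not part of the example above. The recursive lookup happens inside Decoder.instance, so it is only resolved when a value is actually decoded.

```scala
import cats.syntax.all._
import io.circe.{Decoder, Json}
import io.circe.syntax._

// Hypothetical monomorphic variant of Tree, used only for illustration.
sealed trait IntTree
object IntTree {
  final case class Branch(left: IntTree, right: IntTree) extends IntTree
  final case class Leaf(value: Int) extends IntTree

  // A lazy val is initialized once, so every recursive lookup below
  // resolves to this same instance instead of building a new one.
  implicit lazy val decoder: Decoder[IntTree] = {
    val branch: Decoder[IntTree] = Decoder.instance { c =>
      // `as[IntTree]` summons `decoder` only when a document is decoded.
      (c.downField("left").as[IntTree], c.downField("right").as[IntTree])
        .mapN(Branch.apply)
    }
    val leaf: Decoder[IntTree] =
      Decoder[Int].at("value").map[IntTree](Leaf(_))
    branch or leaf
  }
}

object KnotExample {
  val sample: Json =
    Json.obj("left" -> Json.obj("value" := 1), "right" -> Json.obj("value" := 2))

  def main(args: Array[String]): Unit =
    println(sample.as[IntTree])
}
```

This only works because IntTree is a concrete type. Once the tree is generic in A, the instance needs an implicit Decoder[A] argument and has to be a def, which is evaluated afresh on every call.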

Attempt 2: Force it to be lazy

This can be worked around by manually short-circuiting the code that uses the implicit lookup, but it's kind of ugly and requires peeking behind the curtain of the cursor implementation:

implicit def branchDecoder[A: Decoder]: Decoder[Branch[A]] = 
  new Decoder[Branch[A]] {
    override def apply(c: HCursor): Result[Branch[A]] = decodeAccumulating(c).toEither.leftMap(_.head)
    private val nullDecoder = Decoder.failed[Tree[A]](DecodingFailure("Should not see this", Nil))
    override def decodeAccumulating(c: HCursor): AccumulatingResult[Branch[A]] = {
      (
        c.downField("left") match {
          case cursor: HCursor => Decoder[Tree[A]].tryDecodeAccumulating(cursor)
          case cursor => nullDecoder.tryDecodeAccumulating(cursor)
        },
        c.downField("right") match {
          case cursor: HCursor => Decoder[Tree[A]].tryDecodeAccumulating(cursor)
          case cursor => nullDecoder.tryDecodeAccumulating(cursor)
        }
      ).mapN(Branch.apply)
    }
  }

implicit def leafDecoder[A: Decoder]: Decoder[Leaf[A]] = 
  Decoder[A].at("value").map(Leaf(_))

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] = 
  List[Decoder[Tree[A]]](
    Decoder[Branch[A]].widen,
    Decoder[Leaf[A]].widen
  ).reduce(_ combine _)

This does work, but it has rather nasty runtime behavior: the number of decoders it creates scales with the number of nodes in the tree. Instrumenting this with counts of calls to branchDecoder, leafDecoder, and treeDecoder reveals that decoding Tree.json builds 21 total decoders (7 of each type).

Attempt 3: Implicitly self-referential class

Wrapping the implementation of treeDecoder in another Decoder[Tree[A]], one that supplies itself as the implicit Decoder[Tree[A]] for the wrapped implementation and then delegates to it, solves the problem of excessive instantiations:

implicit def branchDecoder[A](implicit DTA: Decoder[Tree[A]]): Decoder[Branch[A]] = 
  Decoder.forProduct2[Branch[A], Tree[A], Tree[A]]("left", "right")(Branch.apply)

implicit def leafDecoder[A: Decoder]: Decoder[Leaf[A]] = 
  Decoder[A].at("value").map(Leaf(_))

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] = 
  new Decoder[Tree[A]] {
    override def apply(c: HCursor): Result[Tree[A]] = decodeAccumulating(c).toEither.leftMap(_.head)
    private implicit val self: Decoder[Tree[A]] = this
    private val delegate =
      List[Decoder[Tree[A]]](
        Decoder[Branch[A]].widen,
        Decoder[Leaf[A]].widen
      ).reduce(_ combine _)
    override def decodeAccumulating(c: HCursor): AccumulatingResult[Tree[A]] = delegate.decodeAccumulating(c)
  }

Solution

Because Defer abstracts this pattern in a much cleaner package, providing instances for Decoder and Encoder would let us write the working version we arrived at like this:

implicit def branchDecoder[A](implicit DTA: Decoder[Tree[A]]): Decoder[Branch[A]] = 
  Decoder.forProduct2[Branch[A], Tree[A], Tree[A]]("left", "right")(Branch.apply)

implicit def leafDecoder[A: Decoder]: Decoder[Leaf[A]] = 
  Decoder[A].at("value").map(Leaf(_))

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] = 
  Defer[Decoder].fix { implicit recurse =>
    List[Decoder[Tree[A]]](
      Decoder[Branch[A]].widen,
      Decoder[Leaf[A]].widen
    ).reduce(_ combine _)
  }
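
For reference, instances along these lines could be sketched as follows. This is only an illustration of the idea: DeferInstances, decoderDefer, and encoderDefer are hypothetical names, and the circe version you use may already ship equivalent instances. Each instance wraps the by-name argument in a lazy val, so the real Decoder or Encoder is built at most once, on first use.

```scala
import cats.Defer
import io.circe.{ACursor, Decoder, Encoder, HCursor, Json}

object DeferInstances {
  // Hypothetical instance: delays building the real decoder until first use.
  implicit val decoderDefer: Defer[Decoder] = new Defer[Decoder] {
    def defer[A](fa: => Decoder[A]): Decoder[A] =
      new Decoder[A] {
        private lazy val underlying: Decoder[A] = fa

        def apply(c: HCursor): Decoder.Result[A] = underlying(c)

        // Forward the other entry points as well, so the wrapped decoder's
        // own handling of failed cursors and error accumulation is kept.
        override def tryDecode(c: ACursor): Decoder.Result[A] =
          underlying.tryDecode(c)
        override def decodeAccumulating(c: HCursor): Decoder.AccumulatingResult[A] =
          underlying.decodeAccumulating(c)
        override def tryDecodeAccumulating(c: ACursor): Decoder.AccumulatingResult[A] =
          underlying.tryDecodeAccumulating(c)
      }
  }

  // Hypothetical instance: the same idea for encoders.
  implicit val encoderDefer: Defer[Encoder] = new Defer[Encoder] {
    def defer[A](fa: => Encoder[A]): Encoder[A] =
      new Encoder[A] {
        private lazy val underlying: Encoder[A] = fa
        def apply(a: A): Json = underlying(a)
      }
  }
}
```

With an instance like decoderDefer in implicit scope, Defer[Decoder].fix hands the block a placeholder decoder that forwards to the finished one, which is what gives branchDecoder a Decoder[Tree[A]] to use without treeDecoder being called again.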

Because Defer doesn't have great visibility, I'd recommend doing what cats-parse does and providing Decoder.recursive to lead users to the correct implementation, which would look like this:

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] = 
  Decoder.recursive[Tree[A]] { implicit recurse =>
    List[Decoder[Tree[A]]](
      Decoder[Branch[A]].widen,
      Decoder[Leaf[A]].widen
    ).reduce(_ combine _)
  }
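
To see how such a helper could behave end to end, here is a self-contained sketch. RecursiveDecoder.recursive is a hypothetical stand-in for the proposed Decoder.recursive, written directly against Decoder rather than through Defer so the block does not depend on any instances being available. The Tree ADT is repeated from the problem statement, the sample JSON is a smaller hypothetical document, and the alternatives are combined with or.

```scala
import cats.syntax.all._
import io.circe.{ACursor, Decoder, HCursor, Json}
import io.circe.syntax._

sealed trait Tree[A]
object Tree {
  final case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
  final case class Leaf[A](value: A) extends Tree[A]
}

object RecursiveDecoder {
  // Wraps a by-name decoder so it is only built when first used.
  private def defer[A](fa: => Decoder[A]): Decoder[A] =
    new Decoder[A] {
      private lazy val underlying: Decoder[A] = fa
      def apply(c: HCursor): Decoder.Result[A] = underlying(c)
      override def tryDecode(c: ACursor): Decoder.Result[A] =
        underlying.tryDecode(c)
      override def decodeAccumulating(c: HCursor): Decoder.AccumulatingResult[A] =
        underlying.decodeAccumulating(c)
      override def tryDecodeAccumulating(c: ACursor): Decoder.AccumulatingResult[A] =
        underlying.tryDecodeAccumulating(c)
    }

  // Hypothetical stand-in for the proposed Decoder.recursive: `fn` receives
  // a placeholder that forwards to the decoder `fn` itself returns.
  def recursive[A](fn: Decoder[A] => Decoder[A]): Decoder[A] = {
    lazy val result: Decoder[A] = fn(defer(result))
    result
  }
}

object TreeDecoders {
  import Tree._

  implicit def branchDecoder[A](implicit DTA: Decoder[Tree[A]]): Decoder[Branch[A]] =
    Decoder.forProduct2[Branch[A], Tree[A], Tree[A]]("left", "right")(Branch.apply)

  implicit def leafDecoder[A: Decoder]: Decoder[Leaf[A]] =
    Decoder[A].at("value").map(Leaf(_))

  implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] =
    RecursiveDecoder.recursive[Tree[A]] { implicit recurse =>
      List[Decoder[Tree[A]]](
        Decoder[Branch[A]].widen,
        Decoder[Leaf[A]].widen
      ).reduce(_ or _)
    }
}

object RecursiveExample {
  import TreeDecoders._

  // Hypothetical sample document, smaller than Tree.json above.
  val json: Json = Json.obj(
    "left" -> Json.obj("value" := 1),
    "right" -> Json.obj(
      "left" -> Json.obj("value" := 2),
      "right" -> Json.obj("value" := 3)
    )
  )

  def main(args: Array[String]): Unit =
    println(json.as[Tree[Int]])
}
```

The body of recursive has the same shape as the fix method on Defer: a lazy val that passes a deferred reference to itself into fn. That is why the helper could be a thin wrapper once a Defer[Decoder] instance exists.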

Arman Bilge · Answer 1 · Sun Apr 02 2023 06:19:36 GMT+0800 (China Standard Time)

👍 linking to:

Aliaksei Kozich · Answer 2 · Thu Mar 07 2024 02:26:17 GMT+0800 (China Standard Time)

@morgen-peschke In your last code snippet

implicit def treeDecoder[A: Decoder]: Decoder[Tree[A]] = 
  Decoder.recursive[Tree[A]] { implicit recurse =>
    List[Decoder[Tree[A]]](
      Decoder[Branch[A]].widen,
      Decoder[Leaf[A]].widen
    ).reduce(_ combine _)
  }

why don't you have ambiguous implicit values if you have both method treeDecoder and recurse of the same type Decoder[Tree[A]]?

Morgen Peschke · Answer 3 · Tue Mar 26 2024 07:16:03 GMT+0800 (China Standard Time)

I'm not actually sure, but the compiler doesn't seem to have any trouble with it.