suzaku-io / boopickle

Binary serialization library for efficient network communication

Serializable Pickler instances

zsolt-donca opened this issue

It would be great if Pickler instances could be made Serializable. This would allow using boopickle as a more efficient data serializer for cluster-computing frameworks such as Apache Spark or Apache Flink. In the case of Flink (the one I use in particular), the Scala API allows customization through an implicit TypeInformation, but the implementation is required to be Serializable. I have a simple PoC with implicit typeclass derivation.

I tried forking boopickle and making the Pickler trait simply extend Serializable, but I only managed to make it work in the simplest cases. The issues I've encountered:

  • the Pickler instances generated by the macro for ADTs sometimes capture the outer object (which is often not serializable when writing Flink jobs), even though there is no apparent reason to do so;
  • the Pickler instances for collections sometimes rely on scala.collection.generic.GenTraversableFactory, which contains references that are not serializable.
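
To illustrate the first point, here is a minimal sketch of how an outer-object capture breaks Java serialization even when the type class itself extends Serializable. It does not use boopickle: `Codec`, `Job`, `Codecs`, and `SerializationCheck` are hypothetical names made up for this example, and the capture is forced on purpose by reading a field of the enclosing class, whereas the macro-generated picklers seem to capture it without an obvious reason.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a Pickler-like type class; NOT boopickle's actual trait.
trait Codec[A] extends Serializable {
  def encode(a: A): String
}

// Typical shape of a job class: it is not Serializable itself.
class Job {
  val prefix: String = "job:"

  // The anonymous class reads `prefix`, so it keeps a reference to the
  // enclosing Job instance. Serializing it must also serialize the Job.
  val capturing: Codec[Int] = new Codec[Int] {
    def encode(a: Int): String = s"$prefix$a"
  }
}

object Codecs {
  // Defined inside a top-level object: there is no enclosing instance to capture.
  val standalone: Codec[Int] = new Codec[Int] {
    def encode(a: Int): String = a.toString
  }
}

object SerializationCheck {
  // Returns true if `value` can be written with plain Java serialization.
  def isJavaSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }
}
```

With these definitions, `SerializationCheck.isJavaSerializable(Codecs.standalone)` succeeds, while the same check on `new Job().capturing` fails because the hidden reference to `Job` is not serializable. A check like this could serve as a regression test for the generated Pickler instances.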

What are the original developers' thoughts? Could this be eventually achieved? I would gladly contribute if I could get some feedback or some basic guidance.