combust / mleap

MLeap: Deploy ML Pipelines to Production

Home Page: https://combust.github.io/mleap-docs/

Python SimpleSparkSerializer: support protobuf as a serialization format

austinzh opened this issue

SimpleSparkSerializer uses the JSON format by default.
For models such as Word2Vec, however, the serialized model can be huge, and decoding the JSON with the current design can cause a heap OOM.
To ease this problem, supporting protobuf as a serialization format would give a smaller model size and better read/write performance with lower memory consumption.
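
To illustrate the size argument, here is a minimal sketch comparing a JSON text encoding of a matrix of word vectors with a fixed-width binary encoding. It does not use MLeap or protobuf: the helper names (`make_vectors`, `encode_json`, `encode_binary`, `decode_binary`) and the 1000 x 100 matrix are assumptions made up for this example, and the `struct`-based layout only stands in for the way a binary format can store each double in 8 bytes.

```python
import json
import random
import struct


def make_vectors(rows, dim, seed=0):
    """Build a hypothetical matrix of word vectors with random float64 values."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


def encode_json(vectors):
    """Text encoding: every double is written out as a decimal string."""
    return json.dumps(vectors).encode("utf-8")


def encode_binary(vectors):
    """Binary encoding: an 8-byte header (rows, dim) plus 8 bytes per double."""
    rows, dim = len(vectors), len(vectors[0])
    out = bytearray(struct.pack("<II", rows, dim))
    for row in vectors:
        out += struct.pack("<%dd" % dim, *row)
    return bytes(out)


def decode_binary(buf):
    """Read the matrix back row by row without building intermediate strings."""
    rows, dim = struct.unpack_from("<II", buf, 0)
    offset = 8
    result = []
    for _ in range(rows):
        result.append(list(struct.unpack_from("<%dd" % dim, buf, offset)))
        offset += 8 * dim
    return result


vectors = make_vectors(1000, 100)
json_bytes = encode_json(vectors)
binary_bytes = encode_binary(vectors)

print("json bytes:  ", len(json_bytes))
print("binary bytes:", len(binary_bytes))
```

With random values the JSON output needs roughly 20 characters per number, while the binary layout needs exactly 8 bytes, and the binary round trip is lossless. Actual savings for an MLeap bundle would depend on the model and on how the bundle is written.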