joschu / cgt

Computation Graph Toolkit

Ability to run execution graph from an external application

dm0 opened this issue

Please add the ability to run a CGT graph compiled with the native backend from a C or C++ application.

Intended pipeline:

  1. Create a computation graph with CGT.
  2. Build it using the native backend.
  3. Link the resulting object files with the external application.
  4. Execute the graph.

Thank you.

I guess this is already possible somehow, since you already communicate with the native binary from Python.
Could you point me to the code that does this communication?
If it is possible to save the CGT-generated C/C++ sources, that would probably help me understand how to run a graph from an external application.

Thanks.

We're working on this, stay tuned.

@joschu What do you think about cross-language execution graph serialization, with JSON or protobuf? Right now execution graphs are given to the C++ interpreter via Cython reading Python objects, but for this case we want a standalone C++ library to be able to read a serialized execution graph. This will introduce a dependency into the C++ side (for reading the serialized representation with a JSON library, for example), and might also introduce it to the Python side (if we use protobuf, which is not installed by default).

I'm sorry to bother you...
Here are my two cents:

  • JSON: binary data has to be base64-encoded, which costs encode/decode time and makes the data about 33% bigger.
  • protobuf looks like a rather heavy dependency. In my experience it is inconvenient to use from Python.

MessagePack (http://msgpack.org/) also looks quite prominent for binary serialization, though I don't have any experience with it.
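
To make the JSON size cost concrete, here is a minimal Python sketch using only the standard library. The field names `dtype`, `shape`, and `data` are illustrative assumptions, not a CGT format. Base64 emits 4 output characters for every 3 input bytes, so the payload grows by roughly one third:

```python
import base64
import json
import struct

# 1000 float64 values packed as raw little-endian bytes (8000 bytes).
values = [i * 0.5 for i in range(1000)]
raw = struct.pack("<1000d", *values)

# JSON cannot hold raw bytes, so the payload has to be text-encoded first.
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"dtype": "float64", "shape": [1000], "data": encoded})

# Relative growth of the payload caused by base64 alone.
overhead = len(encoded) / len(raw) - 1.0
print(f"raw: {len(raw)} bytes, base64: {len(encoded)} bytes, overhead: {overhead:.1%}")

# Round trip: parse the JSON, undo base64, unpack the doubles.
decoded = struct.unpack("<1000d", base64.b64decode(json.loads(document)["data"]))
round_trip_ok = list(decoded) == values
print("round trip ok:", round_trip_ok)
```

A binary format such as MessagePack avoids this step because it can carry byte strings directly.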

Jonathan, I like the idea of having cross-language serialization using some standard format.

One constraint is that we'll need to serialize numerical array data. This would be kludgy in JSON.
Protobuf would be an OK choice, though it might be a bit too heavyweight.

If I remember correctly, @pcmoritz also recommended msgpack.
At first glance, it looks like a good choice to me.

HDF5 is a good option and is already used in many Python projects.

(Sent by email in reply to John Schulman's comment of Tue, Sep 22, 2015, 1:19 PM.)

I was just referring to serialization of instructions/execution graph. This shouldn't involve serializing binary data, right?

The ReturnByRef and ReturnByVal instructions have to store the closure data for the Ops they're associated with. And a Constant Op has a value associated with it, so it'll be necessary to serialize the data.
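
As an illustration of what serializing a Constant's value could involve, here is a minimal Python sketch of a self-describing binary array record. The layout (dtype code, rank, shape, raw elements) and the names `dump_array` and `load_array` are assumptions made for this example; they are not CGT's actual format or API.

```python
import struct
import sys
from array import array

# Hypothetical layout (not CGT's actual format):
#   1 byte          dtype code, an array-module typecode such as b"d" or b"f"
#   1 byte          number of dimensions
#   ndim * 8 bytes  shape, little-endian uint64 per dimension
#   rest            elements in row-major order, little-endian


def dump_array(typecode, shape, values):
    """Pack a flat list of values plus its shape into one bytes object."""
    data = array(typecode, values)
    count = 1
    for dim in shape:
        count *= dim
    if count != len(data):
        raise ValueError("shape does not match number of values")
    if sys.byteorder == "big":
        data.byteswap()  # store elements as little-endian on every host
    header = struct.pack("<cB", typecode.encode("ascii"), len(shape))
    header += struct.pack(f"<{len(shape)}Q", *shape)
    return header + data.tobytes()


def load_array(buf):
    """Inverse of dump_array: returns (typecode, shape, flat list of values)."""
    code, ndim = struct.unpack_from("<cB", buf, 0)
    shape = struct.unpack_from(f"<{ndim}Q", buf, 2)
    data = array(code.decode("ascii"))
    data.frombytes(buf[2 + 8 * ndim:])
    if sys.byteorder == "big":
        data.byteswap()
    return code.decode("ascii"), list(shape), data.tolist()


blob = dump_array("d", [2, 3], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(len(blob))  # 2-byte header + 2 * 8 bytes of shape + 6 * 8 bytes of data
print(load_array(blob))
```

The sketch sticks to the `d` (float64) and `f` (float32) typecodes, whose sizes are the same on common platforms; integer typecodes in the `array` module vary in width and would need an explicit size in the header.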

I have started implementing serialization for execution graphs here: https://github.com/hojonathanho/cgt/tree/serialization
It's incomplete for now, but cgtArrays can be serialized.

I chose a C++-only serialization framework because I think that with the current way things are set up, it's best for serialization/deserialization to happen in C++/Cython only -- we should never be constructing execution graphs in Python anyway. It's also a lot faster to not have to worry about cross-language compatibility, since we can effectively directly serialize the bits in structs. Let me know if you think this is the right way to go.
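
One trade-off of writing struct bits directly is that the file then depends on the writer's byte order and field layout. The Python sketch below illustrates this with a made-up instruction header (opcode, output slot, payload size); the fields are hypothetical and do not come from the CGT sources.

```python
import struct
import sys

# Hypothetical instruction header: opcode (uint32), output slot (uint32),
# payload size in bytes (uint64).
FIELDS = "IIQ"

native = struct.pack("=" + FIELDS, 7, 3, 48)  # host byte order, no padding
little = struct.pack("<" + FIELDS, 7, 3, 48)  # explicitly little-endian
big = struct.pack(">" + FIELDS, 7, 3, 48)     # explicitly big-endian

# The native dump matches exactly one of the two explicit encodings.
print(sys.byteorder, native == little, native == big)

# A reader that assumes the wrong byte order gets different numbers
# without any error being raised.
misread = struct.unpack(">" + FIELDS, little)
print(misread[0])
```

If the serialized graphs are only ever read on the same kind of machine that wrote them, the raw dump is fine; otherwise fixing the byte order in the format costs little.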

Agreed that it makes sense to serialize in C++. But the closure data for the Ops is created in Python. How do you plan to serialize each piece of closure data?