morganstanley / hobbes

A language and an embedded JIT compiler

Home Page: http://hobbes.readthedocs.io/

Example of a basic feed handler or its building blocks

robsmith11 opened this issue · comments

Apologies if there is a more appropriate place to ask this or I missed where it's covered in the docs.

I'd like to get an idea of how easily hobbes could replace kdb+/q as a language for building feed handlers. Are there any examples of either an entire feed handler or the components with which one could easily build one? E.g., raw TCP sockets, HTTP queries, SSL, websockets, CSV/json/FIX parsing, compressed log file writing, etc.

Unfortunately, "hobbes" isn't the easiest name to google, so I haven't found any resources beyond the docs here.

Hi Rob, this is actually a good place to discuss things like this (we wind up with a lot of discussion under "issues" sometimes).

Many of the feed handlers we run are written in C++ and use structured logging or one of the included header-only libs to write structured data files directly (fregion.H for general purpose, cfregion.H to write compressed series -- for high-volume market data we almost always write compressed).

Our main reason for writing these in C++ has been that we have to use existing C++ libraries to access most of this data; we're not just opening sockets and reading packets.

Having said that, we do have some large query processes where we write out structured data files as indexes from hobbes (which could just as easily be from processing feeds) so it's definitely doable. There are some ongoing discussions about open sourcing more of that kind of tickerplant infra (but it's been discussed for a while, kind of a sensitive topic to open source more of that sort of thing).

Generally, we wind up with a lot of structured data files produced by different processes, coming together in a few large machines where we have scripts that will index incoming data and define a simple query interface to all of the data (which is one place where all of the type class machinery comes in handy). We also use the networking infra to have query processes that distribute and aggregate queries across these machines where all of this data is stored (which has some interesting challenges to make sure those distributed queries type-check but without being more onerous than e.g. distributed kdb queries).

There's some overlap with kdb, some areas where we've got a better story and some areas where they do. I'd like to push that even further, because I think we've got the better foundation, but I'm one guy and have other things I have to work on in my day job, so it will take a while.

We do have syntax for defining parsers, which might be helpful to write the kind of examples you're looking for w.r.t. CSV/json/FIX. There are some examples of that in the project readme, which might be helpful if you'd like to experiment with that feature.
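To make the FIX part of the question concrete, here is a minimal sketch of tag=value parsing in plain C++ (C++17). It does not use the hobbes parser syntax mentioned above, and `parseFix` is a hypothetical helper name, not part of hobbes or any FIX library. It assumes fields are separated by SOH (0x01), skips malformed fields, and does not validate checksums or body length.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical helper (not a hobbes API): split a FIX tag=value message
// into a tag -> value map. The delimiter defaults to SOH (0x01).
// Assumes tags are short decimal numbers; no overflow check is done.
std::map<int, std::string> parseFix(std::string_view msg, char delim = '\x01') {
  std::map<int, std::string> fields;
  std::size_t pos = 0;
  while (pos < msg.size()) {
    // Find the end of the current field (or the end of the message).
    std::size_t end = msg.find(delim, pos);
    if (end == std::string_view::npos) end = msg.size();
    std::string_view field = msg.substr(pos, end - pos);

    // A well-formed field looks like "<digits>=<value>".
    std::size_t eq = field.find('=');
    if (eq != std::string_view::npos && eq > 0) {
      int tag = 0;
      bool ok = true;
      for (char c : field.substr(0, eq)) {
        if (c < '0' || c > '9') { ok = false; break; }
        tag = tag * 10 + (c - '0');
      }
      // Fields with a non-numeric tag are skipped.
      if (ok) fields[tag] = std::string(field.substr(eq + 1));
    }
    pos = end + 1;
  }
  return fields;
}
```

For readability in tests or logs, a printable delimiter can be passed instead of SOH, e.g. `parseFix("8=FIX.4.2|35=D|55=IBM|", '|')`. If a tag repeats, the last value wins, so repeating groups would need a different container.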

Maybe we can discuss specifically what a good example program would do for you. We'd probably want to import existing libs for something like SSL.

Thanks for the detailed reply! I hadn't realized the extent to which you were still relying on C++ for the feed processing. It seems that hobbes is really just focused on providing the fast and robust run-time flexibility for the core logic of how data feeds interact after they've already been parsed and piped into hobbes.

I don't need quite that level of run-time flexibility, so it's nice to be able to use kdb's built-in web socket support with SSL and easy json, csv, and key-value pair parsing. As my performance and flexibility needs change I'll look into incorporating some more C++ components and switching to hobbes.