splitgraph / seafowl

Analytical database for data-driven Web applications 🪶

Home Page:https://seafowl.io

Return timing data in response header(s)

milesrichardson opened this issue

Related: #356

Feature Request

It would be nice if the response from Seafowl included some information about execution time in the response headers. Currently, a client has no insight into execution metrics, other than measuring the total round trip time of the request and response.

Motivation

Access to more timing data would help with diagnostics and with writing interactive query tools similar to the Splitgraph Query Console. For example, the Splitgraph DDN provides the fields executionTime and executionTimeHighRes, which represent the time the Splitgraph Engine spent "processing the query." In effect, this is the interval between the moment the "DDN web bridge" stops reading the HTTP request and the moment it starts writing the HTTP response.

Open questions:

What should be measured?

  • A minimal implementation should measure the total processing time: the interval between when Seafowl finishes reading the HTTP request body and when it starts writing the HTTP response body.
  • Other more granular "internal" metrics might also be helpful, e.g. time spent resolving tables or fetching an external resource. But this generalizes to a telemetry framework, since it's possible to measure the time between any two events. So perhaps there are some important metrics worth including by default.
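To make the two bullets above concrete, here is a minimal sketch of a timer that records both the total time and a few named phases. It uses only std::time; QueryTimer, mark and the phase names are hypothetical and do not come from the Seafowl codebase.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of Seafowl): records named phases
/// measured back to back, plus the total time since creation.
pub struct QueryTimer {
    start: Instant,
    last: Instant,
    spans: Vec<(&'static str, Duration)>,
}

impl QueryTimer {
    /// Start timing, e.g. right after the request body has been read.
    pub fn start() -> Self {
        let now = Instant::now();
        Self { start: now, last: now, spans: Vec::new() }
    }

    /// Close the current phase under `name`; the next phase begins now.
    pub fn mark(&mut self, name: &'static str) {
        let now = Instant::now();
        self.spans.push((name, now.duration_since(self.last)));
        self.last = now;
    }

    /// Total time since `start()`, e.g. read just before the response is written.
    pub fn total(&self) -> Duration {
        self.start.elapsed()
    }

    /// The phases recorded so far, in the order they were marked.
    pub fn spans(&self) -> &[(&'static str, Duration)] {
        &self.spans
    }
}
```

A handler would call QueryTimer::start() once the request is parsed, mark("resolving") and mark("exec") after the corresponding steps, and read total() just before writing the response. The minimal implementation only needs start() and total().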

How should it be returned in the response headers?

All considerations from #356 (comment) apply:

  • Each metric could be its own response header
  • All metrics could be in one response header
  • The most important metrics could be in the parameters of the Content-Type header, similarly to how we might return the field types. (The advantage of this option is that Content-Type is a default CORS-safelisted HTTP response header.)

For the content-type option, perhaps any non-field metadata could be in parameters conventionally prefixed with __, e.g.:

content-type: application/octet-stream; __exec_ms=15; __resolving_ms=20; __total_ms=35; total_volume=FLOAT; year=INT;
...
{"total_volume":78.0,"year":2018}
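As a rough sketch of the Content-Type option, the server could append the metrics as `__`-prefixed parameters and a client could pick them back out while ignoring the field-type parameters. The function names and the `__<name>_ms` convention below are assumptions based on the example above, not an existing Seafowl API.

```rust
/// Hypothetical server-side helper: append each metric to the media type
/// as a `__<name>_ms=<value>` parameter.
pub fn content_type_with_metrics(base: &str, metrics: &[(&str, u64)]) -> String {
    let mut value = String::from(base);
    for (name, ms) in metrics {
        value.push_str(&format!("; __{}_ms={}", name, ms));
    }
    value
}

/// Hypothetical client-side helper: extract the `__<name>_ms` parameters
/// and skip everything else (e.g. field types such as `year=INT`).
pub fn parse_metrics(content_type: &str) -> Vec<(String, u64)> {
    content_type
        .split(';')
        .skip(1) // the first segment is the media type itself
        .filter_map(|param| {
            let (key, val) = param.trim().split_once('=')?;
            let name = key.strip_prefix("__")?.strip_suffix("_ms")?;
            let ms = val.trim().parse::<u64>().ok()?;
            Some((name.to_string(), ms))
        })
        .collect()
}
```

This is a naive split on `;` and `=`. It does not handle quoted parameter values, which a real media-type parser would need to.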
commented

OK, after flailing around in Warp and its world of Filters, I switched gears and added an explicit timer inside cached_read_query / uncached_read_write_query.
It's very simple and not yet DRY'd, but I wanted to push this up as a PoC of 'whole request' timing.

Whether Seafowl returns this header should probably be either 1) configurable, or 2) tied to the log level, following in warp::trace::request()'s footsteps and returning it only at INFO/DEBUG level. Or maybe both. (?)
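For illustration, the configurable variant could look roughly like the sketch below: an explicit timer wrapped around the handler body, with the header produced only when a flag is set. TimingConfig, timed and the header name x-seafowl-runtime-ms are all hypothetical; this is not the code from the PoC.

```rust
use std::time::Instant;

/// Hypothetical config flag; not an existing Seafowl setting.
pub struct TimingConfig {
    pub emit_header: bool,
}

/// Run `handler` under an explicit timer. Returns the handler's result and,
/// if enabled, a (header name, header value) pair with the elapsed milliseconds.
pub fn timed<T>(
    cfg: &TimingConfig,
    handler: impl FnOnce() -> T,
) -> (T, Option<(&'static str, String)>) {
    let started = Instant::now();
    let result = handler();
    let elapsed_ms = started.elapsed().as_millis();
    // Header name is an assumption made for this sketch.
    let header = cfg
        .emit_header
        .then(|| ("x-seafowl-runtime-ms", elapsed_ms.to_string()));
    (result, header)
}
```

Wrapping both query paths in one helper like this would also take care of the DRY part. Note that a custom header like this is not CORS-safelisted, so browser clients would only see it if it is listed in Access-Control-Expose-Headers.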

Thoughts?

Closed by #438