Supporting Incremental Delivery with @defer/@stream directives
potatosalad opened this issue · comments
I wanted to open an issue for discussion of incremental delivery with the @defer and @stream directives in Argo.
Following the specs in graphql/graphql-spec#742, which is now in the "Draft (RFC 2)" state, the following is a potential wire-type solution for dealing with these incremental responses:
schema {
  query: Query
}
type Query {
  root: Object!
}
type Object {
  children: [Object!]!
}
query {
  root {
    required: __typename
    ... @defer {
      deferred_x: __typename
    }
    ... @defer(label: "defer_z") {
      deferred_y: __typename
    }
    children_x: children @stream {
      streamed_x: __typename
    }
    children_y: children @stream(label: "stream_z") {
      streamed_y: __typename
    }
  }
}
The examples below use an ERROR wire type that is an alias to the following:
{
  message: STRING<String>
  location?: {
    line: VARINT<Int>
    column: VARINT<Int>
  }[]
  path?: PATH
  extensions?: DESC_OBJECT
}
In addition, the examples use a new wire type referred to as UNION (a tagged union type similar to the one found in BARE):
{
  data?: {
    root: {
      required: STRING<String>
      deferred_x?: STRING<String>
      deferred_y?: STRING<String>
      children_x: {
        streamed_x: STRING<String>
      }[]
      children_y: {
        streamed_y: STRING<String>
      }[]
    }
  }?
  incremental?: UNION {
    <0>: {
      path: PATH
      data: {
        deferred_x: STRING<String>
      }?
      errors?: ERROR[]?
      extensions?: DESC_OBJECT
    }
    <1:"defer_z">: {
      path: PATH
      data: {
        deferred_y: STRING<String>
      }?
      errors?: ERROR[]?
      extensions?: DESC_OBJECT
    }
    <2>: {
      path: PATH
      items: {
        streamed_x: STRING<String>
      }[]
      errors?: ERROR[]?
      extensions?: DESC_OBJECT
    }
    <3:"stream_z">: {
      path: PATH
      items: {
        streamed_y: STRING<String>
      }[]
      errors?: ERROR[]?
      extensions?: DESC_OBJECT
    }
  }[]
  hasNext?: BOOLEAN<Boolean>
  errors?: ERROR[]?
  extensions?: DESC_OBJECT
}
Any initial thoughts or opinions? Thanks!
I think this makes a lot of sense. In Argo, we'd probably want to note that it's subject to change to whatever is eventually incorporated into the GraphQL spec.
Here are a variety of notes and thoughts:
- It's a small bummer to introduce a new UNION type, but I think it's the best option. I think we'd use a VARINT tag for implementation simplicity, even though the negative numbers will be wasted (so using many different @defer/@stream in the same query will use a few extra bytes). One alternative to UNION would be to fake it in a synthetic RECORD with lots of fields (perhaps one RECORD for @stream and one for @defer) but that feels unnecessarily awkward to me, and the payloads would be unnecessarily large.
- The items field should be nullable in @stream payloads (I see you already made data nullable for @defer payloads).
- Inside the union, the path is fully knowable for @defer, and knowable except the final index for @stream. We could avoid encoding it (perhaps including only the index), or drop the known prefix as we do in Argo Field errors and add it back in after decoding. I'm inclined toward this last, for consistency's sake.
- Inside the union, label (if any) is always known. We can avoid sending it over the wire or including it in the wire type, but make it available to users. (I think you had this in mind, guessing from your UNION syntax.)
- In incremental, I think extensions should be nullable (as well as omittable).
- In ERROR, I think extensions, path, and location should be nullable (as well as omittable).
- In implementations, this may result in in-memory wire types becoming pretty large due to the repetition of path, extension, and especially error field types. It's not a new problem for this sort of thing, but careful use of generics and other language/runtime features that help with sharing or parameterization will be useful where performance matters. I'd expect the JSON Wire schema serialization of these to get unwieldy quickly.
Thanks for looking into this!
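The remark above about negative numbers being wasted in a VARINT tag can be illustrated with a small sketch. It assumes the VARINT encoding is zig-zag followed by 7-bits-per-byte variable-length coding (as in Protocol Buffers' signed varints); the function names `zigzag` and `encode_varint` are made up for this example and are not taken from any Argo implementation.

```python
def zigzag(n: int) -> int:
    """Map signed 64-bit ints to unsigned ones: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (n << 1) ^ (n >> 63)


def encode_varint(n: int) -> bytes:
    """Zig-zag the value, then emit 7 bits per byte, low-order bits first."""
    u = zigzag(n)
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Union tags are never negative, so half of each byte's range goes unused.
for tag in (0, 63, 64):
    print(tag, len(encode_varint(tag)))
```

Under these assumptions, tags 0 through 63 fit in one byte and tag 64 already needs two, whereas an unsigned encoding would reach 127 before growing. That is the "few extra bytes" cost for queries with many @defer/@stream usages.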
- It's a small bummer to introduce a new UNION type, but I think it's the best option. I think we'd use a VARINT tag for implementation simplicity, even though the negative numbers will be wasted (so using many different @defer/@stream in the same query will use a few extra bytes). One alternative to UNION would be to fake it in a synthetic RECORD with lots of fields (perhaps one RECORD for @stream and one for @defer) but that feels unnecessarily awkward to me, and the payloads would be unnecessarily large.
Yeah, I played around with a few different options here as well, but I thought a simple VARINT tagged UNION might be the simplest solution. The label representation <1:"defer_z"> would be internal to the wire-type only and useful for converting an Argo value back into the JSON representation where the {"label": "defer_z", "path": ..., "data": ...} would need to be inserted. For wire encoding/decoding, it would just be a normal VARINT(1).
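A minimal sketch of that conversion, assuming the decoder already has the union tag and the decoded record as a dict. The `UNION_TAGS` table and `to_json_entry` are hypothetical names for illustration; in practice the table would be derived from the wire type for the specific query.

```python
# Hypothetical per-query table derived from the wire type: tag -> (kind, label).
# Only the tag travels over the wire; the label lives in the wire type.
UNION_TAGS = {
    0: ("defer", None),
    1: ("defer", "defer_z"),
    2: ("stream", None),
    3: ("stream", "stream_z"),
}


def to_json_entry(tag, value):
    """Rebuild the JSON `incremental` entry for a decoded union member."""
    _kind, label = UNION_TAGS[tag]
    entry = {}
    if label is not None:
        entry["label"] = label  # re-inserted from the wire type, not from the bytes
    entry.update(value)
    return entry


entry = to_json_entry(1, {"path": ["root"], "data": {"deferred_y": "Object"}})
```

Going the other way (JSON to Argo), the same table could be searched by label and payload kind to pick the tag to write.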
- The items field should be nullable in @stream payloads (I see you already made data nullable for @defer payloads)
Oh, yup, you're correct. I wrote the pseudo wire type by hand so there may be other accidental mistakes.
- Inside the union, the path is fully knowable for @defer, and knowable except the final index for @stream. We could avoid encoding it (perhaps including only the index), or drop the known prefix as we do in Argo Field errors and add it back in after decoding. I'm inclined toward this last, for consistency's sake.
The path for @defer is only partially known prior to execution, consider the case where @defer occurs underneath an array:
schema {
  query: Query
}
type Query {
  x: X!
}
type X {
  ys: [Y!]!
}
type Y {
  z: Z!
}
type Z {
  name: String!
}
query {
  x {
    ys {
      z {
        __typename
        ... @defer {
          name
        }
      }
    }
  }
}
The path for the @defer in this case would be ["x", "ys", VARINT, "z"], where VARINT is the list index: there will be a separate incremental reply, each with its own path, for each item under ys (the same can be said for nested cases of @stream).
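To make this concrete, here is a small sketch of how the statically known part of the path could be treated as a template with holes for list indices. The `INDEX` placeholder, `TEMPLATE`, and `fill_path` are illustrative names only, and the three-item list is an assumed example result.

```python
INDEX = object()  # placeholder for a list index that is only known at execution time

# What can be derived from the query alone for the @defer under x.ys[*].z
TEMPLATE = ["x", "ys", INDEX, "z"]


def fill_path(template, indices):
    """Substitute runtime list indices into the placeholders, left to right."""
    remaining = iter(indices)
    return [next(remaining) if seg is INDEX else seg for seg in template]


# If ys resolves to three items, the server sends three deferred payloads,
# one per concrete path.
paths = [fill_path(TEMPLATE, [i]) for i in range(3)]
```

Only the index positions vary between payloads, which is why the path cannot be dropped from the wire entirely when a @defer sits below a list.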
- Inside the union, label (if any) is always known. We can avoid sending it over the wire or including it in the wire type, but make it available to users. (I think you had this in mind, guessing from your UNION syntax)
Correct, it's primarily used for converting from and to the JSON representation, only the VARINT index is encoded/decoded for the tagged UNION.
- In incremental, I think extensions should be nullable (as well as omittable)
There's some discussion in the PR about this:
The GraphQL server may determine there are no more values in the response stream after a previous value with hasNext equal to true has been emitted. In this case the last value in the response stream should be a map without data and incremental entries, and a hasNext entry with a value of false.
I don't think {"incremental": null} has any meaning in the way that {"data": null} does; it would always be something more like {"incremental": [{"path": ..., "data": null, "errors": [...]}]} instead.
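The payload shapes discussed here can be sketched from the client's side. The following is a minimal illustration, not a reference implementation: it assumes that a @defer entry's path points at the object to extend, that a @stream entry's path ends with the index of its first item, and that payloads arrive in order. `apply_payload` is a hypothetical helper, and errors and extensions are ignored for brevity.

```python
def apply_payload(data, payload):
    """Fold one payload's `incremental` entries into `data`; return hasNext."""
    for entry in payload.get("incremental", []):
        path = entry["path"]
        if "items" in entry:
            # @stream: everything but the last segment locates the list
            target = data
            for key in path[:-1]:
                target = target[key]
            target.extend(entry["items"] or [])  # null items: nothing to append
        else:
            # @defer: the full path locates the object receiving new fields
            target = data
            for key in path:
                target = target[key]
            target.update(entry.get("data") or {})  # null data: nothing to merge
    return payload.get("hasNext", False)


data = {"root": {"required": "Object", "children_x": []}}
stream = [
    {"incremental": [{"path": ["root"], "data": {"deferred_x": "Object"}}],
     "hasNext": True},
    {"incremental": [{"path": ["root", "children_x", 0],
                      "items": [{"streamed_x": "Object"}]}],
     "hasNext": True},
    {"hasNext": False},  # terminal payload: no data, no incremental
]
has_next = True
for payload in stream:
    has_next = apply_payload(data, payload)
```

Note that the terminal payload simply omits the incremental entry, which is the case the quoted spec text describes; nothing in this flow needs {"incremental": null}.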
At least that's my current understanding after reading through the specs. For extensions, see my comment below and let me know what you think.
- In ERROR, I think extensions, path, and location should be nullable (as well as omittable)
Following the wording of the Errors section of the GraphQL Spec:
If present, the errors entry in the response must contain at least one error. If no errors were raised during the request, the errors entry must not be present in the result.
…
Every error must contain an entry with the key message with a string description of the error intended for the developer as a guide to understand and correct the error.
…
GraphQL services may provide an additional entry to errors with key extensions. This entry, if set, must have a map as its value.
Nothing is explicitly stated about path and location, but I had interpreted it to have similar meaning to extensions where it either should be present and of a specific format, or otherwise omitted entirely.
This also seems to imply that the typing for errors?: ERROR[]? might be better represented as errors?: ERROR[].
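As a rough illustration of the stricter reading (omittable but not nullable), the quoted rules could be checked on the JSON side like this. `validate_errors` is a hypothetical helper written for this example; it covers only the three quoted rules and deliberately says nothing about path or location.

```python
def validate_errors(response):
    """Return a list of problems with the `errors` entry, per the quoted rules."""
    if "errors" not in response:
        return []  # omitted entirely: always fine
    errors = response["errors"]
    if not isinstance(errors, list) or not errors:
        # covers both `null` and `[]`, neither of which the strict reading allows
        return ["errors must be a non-empty list when present"]
    problems = []
    for i, err in enumerate(errors):
        if not isinstance(err, dict):
            problems.append(f"errors[{i}] must be a map")
            continue
        if not isinstance(err.get("message"), str):
            problems.append(f"errors[{i}].message must be a string")
        if "extensions" in err and not isinstance(err["extensions"], dict):
            problems.append(f"errors[{i}].extensions must be a map when set")
    return problems
```

With this reading, a response like {"errors": null} is rejected, which corresponds to dropping the trailing ? from errors?: ERROR[]?.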
What do you think?
- In implementations, this may result in in-memory wire types becoming pretty large due to the repetition of path, extension, and especially error field types. It's not a new problem for this sort of thing, but careful use of generics and other language/runtime features that help with sharing or parameterization will be useful where performance matters. I'd expect the JSON Wire schema serialization of these to get unwieldy quickly.
Yeah, I thought about potentially introducing a RESPONSE wire type that might make it easier for implementations to (1) reference fragments of records and (2) make path validation more standardized. Something like:
{
  "type": "RESPONSE",
  "data": {
    "type": "RECORD",
    "fields": [...]
  },
  "incremental": [
    {
      "type": "DEFER",
      "index": 0,
      "data": ...
    },
    {
      "type": "STREAM",
      "index": 1,
      "item": ...
    }
  ]
}
Internally, it could expand to the full wire-type involving errors and extensions.
Fields underneath the data key for the RESPONSE could reference the incremental portions with something like {"type": "FRAGMENT", "index": 0} or similar.
This would also make it so the encoding for PATH could have its starting point underneath data, which matches how it's used in the JSON encoding.
Yeah, I played around with a few different options here as well, but I thought a simple VARINT tagged UNION might be the simplest solution. The label representation <1:"defer_z"> would be internal to the wire-type only and useful for converting an Argo value back into the JSON representation where the {"label": "defer_z", "path": ..., "data": ...} would need to be inserted. For wire encoding/decoding, it would just be a normal VARINT(1).
Yeah, it would be handy to have the label value available. Instead of baking it into the union (which is probably the only place it will be used), I'm somewhat more inclined to introduce a type like CONST_STRING:
incremental?: UNION {
  ...
  <1>: {
    label: CONST_STRING="defer_z"
    path: PATH
    data: {
      deferred_y: STRING<String>
    }?
    errors?: ERROR[]?
    extensions?: DESC_OBJECT
  }
  ...
}[]
Another small bummer, but clear enough. It would naturally extend to constants of other types, or even default values, but GraphQL has little or no need for these at the moment.
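A sketch of what such a constant type might look like in an implementation: it contributes zero bytes when encoding and yields its fixed value when decoding. The `ConstString` class and its `encode`/`decode` signatures are invented for illustration and do not correspond to any existing Argo API.

```python
class ConstString:
    """Hypothetical wire type whose value is fixed by the wire type itself."""

    def __init__(self, value: str):
        self.value = value

    def encode(self, value: str) -> bytes:
        # Nothing goes on the wire; just check the caller agrees with the schema.
        if value != self.value:
            raise ValueError(f"expected {self.value!r}, got {value!r}")
        return b""

    def decode(self, buf: bytes, offset: int):
        # Consume no bytes; return the constant and the unchanged offset.
        return self.value, offset


label = ConstString("defer_z")
encoded = label.encode("defer_z")
decoded, next_offset = label.decode(b"\x01\x02", 1)
```

The same shape would work for constants of other types, which matches the observation that the idea extends naturally beyond strings.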
The path for @defer is only partially known prior to execution, consider the case where @defer occurs underneath an array:
...
Great explanation, thanks. The same probably applies to @stream as well. In that case, perhaps it's simplest to leave PATHs unmodified. Of course, the main alternative would be to truncate at the first list/index. I'm not sure it's worth the hassle.
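For reference, the truncation alternative could look roughly like the sketch below: split a concrete path at its first list index, so that only the suffix would need to be encoded and the statically known prefix re-attached after decoding. `split_at_first_index` is a hypothetical helper, and it assumes list indices are the only integer segments in a path.

```python
def split_at_first_index(path):
    """Split a path into (known prefix, suffix starting at the first list index)."""
    for i, seg in enumerate(path):
        if isinstance(seg, int):
            return path[:i], path[i:]
    return path, []  # no list involved: the whole path is statically known


prefix, suffix = split_at_first_index(["x", "ys", 0, "z"])
```

Here the prefix is ["x", "ys"] and the suffix is [0, "z"]. The saving is only the leading field names, which gives a sense of why it may not be worth the extra complexity.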
In incremental, I think extensions should be nullable (as well as omittable)
...What do you think?
Your reasoning makes sense to me. I had just checked which types I used in the reference implementation for ERROR. IIRC I wanted to support whatever JSON folks might have, but I like the stricter approach you take.
Yeah, I thought about potentially introducing a RESPONSE wire type ...
Nice. I think for now it's probably simplest to leave it up to implementations, and have the spec use the maximally-expanded version which everything must eventually be equivalent to.