dunglas / vulcain

Fast and idiomatic client-driven REST APIs.

Home Page: https://vulcain.rocks


[Question] Relations preloaded within first 'main' request, making the gateway server requests very slow

matthijsoberon opened this issue · comments

Hi,

First of all, very interesting project!

I'm implementing the Vulcain gateway server in my application and it's performing slowly. To be sure it's not my application's fault, I downloaded and ran the demo, and it is also slow at preloading the relations.
It looks like Vulcain is preloading every relation in the main request, slowing down the request when it calls the 'conferences' endpoint:
(screenshot, 2020-03-10: browser dev tools timeline showing the slow preloading 'conferences' request)
This is in contrast to the SymfonyCon slides, where the preloading 'conferences' request takes about the same time (a bit more than 4 seconds) as requesting the 'conferences' endpoint without preloading:
(slide from the SymfonyCon presentation)

I'm not sure if this is how Vulcain is supposed to work, but if it is, I don't really see how Vulcain can be faster than, for example, GraphQL?

Hi @matthijsoberon. I actually have the same issue. My "main" request (the one with the Preload header) spikes from 500ms~1s to almost ~6s, but the global load time seems about the same. I'm still investigating to see whether it's a bad NGINX configuration.

tl;dr: Use Varnish ⚡️!

To avoid race conditions, the HTTP/2 spec recommends not starting to send the body of the main request before the push promises of the related resources. The main response's body could (and, in our case, will) contain references to these related resources, and the client could start a request to download them while they are also being pushed by the server, creating a race condition.

The server SHOULD send PUSH_PROMISE (Section 6.6) frames prior to
sending any frames that reference the promised responses. This
avoids a race where clients issue requests prior to receiving any
PUSH_PROMISE frames.

Because of this, when you use the gateway server (and, as documented, only in this case; more about this at the end of this post), the gateway server has to wait for the responses of all resources to push before starting to send data to the client. This is usually not an issue, except when your upstream API is slow (more on how to fix this later).

There is nothing we can do about that, and GraphQL has the exact same problem: for instance, when you use Apollo with a REST data source (the GraphQL setup most similar to what the Vulcain gateway server does), Apollo has to wait for all responses coming from the REST API (or any other data source, actually) before starting to send the JSON document to the client.
But then, Apollo cannot send this monolithic response (the big JSON document) in parallel (you cannot split a single JSON document across multiple HTTP/2 streams), while the Vulcain gateway server pushes all the separate resources in parallel in different HTTP/2 streams (because they are different resources, not a single big JSON document). That's why, even in this case, Vulcain is theoretically faster than GraphQL.

So the main issue here is that your upstream data source (your internal REST API) is slow. If you don't use Vulcain and just make all the needed requests directly from the client, it will be even slower (because you'll have under-fetching and n+1 requests on top of that).

I assume you use PHP-FPM or mod_php. Unlike Node, Go, Java, and even things such as AMPHP and ReactPHP, these solutions are known to have a bootstrap time for every request because of their "fire and forget" nature. It's even worse if, for instance, you use the dev mode of Symfony, Laravel, or similar frameworks (you'll pay that overhead for every pushed response, and it can quickly become very slow!).

To prevent this, as I've done for my SymfonyCon demo, an easy trick is to not call PHP when it's not necessary.
HTTP provides a nice mechanism for that: HTTP caching. In my demo, I used Varnish (between the upstream PHP API and the Vulcain gateway server) to store the responses of the upstream API in cache (I also used the invalidation mechanism provided by API Platform to be sure that the cache never goes stale). The gateway server, which is on the same local network or even the same machine as the Varnish server and the upstream REST API, can then fetch the resources it needs in a few milliseconds. Also, be sure to run the upstream API in prod mode.
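To illustrate that setup, here is a minimal sketch of such a Varnish VCL (the backend host/port and the 1-hour TTL are assumptions for this example; in practice API Platform's Varnish integration also sets up the tag-based invalidation so the cache never goes stale):

```vcl
vcl 4.1;

# Upstream PHP API (hypothetical hostname/port for this sketch).
backend api {
    .host = "php-upstream";
    .port = "8080";
}

sub vcl_backend_response {
    # Cache upstream API responses so the Vulcain gateway server can
    # fetch them in a few milliseconds without hitting PHP; rely on
    # purge/ban-based invalidation to evict stale entries.
    set beresp.ttl = 1h;
}
```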

By the way, such setup would also work with a GraphQL gateway as presented above.

Another option is not to use the gateway server at all and to implement Vulcain directly at the app layer. For instance, the app could do the SQL query to retrieve the needed data, compute the dependency graph, send all the push promises first, and then start sending all the responses in parallel, in separate HTTP/2 streams. This approach (without a gateway) is similar to what API Platform does with its nested documents and GraphQL features, but is faster because the client downloads all resources in parallel (and again, that's not possible with a single big JSON document).
To do so, if your project is written in Go, you can even use the code in this repo as a library to parse the Vulcain HTTP headers and compute the requested dependency graph.

It's also doable in Node, Java, etc., but unfortunately not with PHP-FPM or mod_php, because they don't manage the TCP/UDP connection directly and so can't create new HTTP/2 streams themselves (only the web server itself can).
If you want to follow this path with PHP, you'll have to use AMPHP (because it has an HTTP/2 server written directly in PHP, which manages the TCP connection itself) or ReactPHP. Honestly, I'm not sure it is worth it yet; configuring Varnish properly should be good enough!

Thanks for the clarification @dunglas. Though I don't really see the benefit of using Vulcain now: with it, all the requests have to be done up front, and the TTFB is bigger than doing a simple first request and then iterating over the results. I haven't tried configuring Varnish yet, but even with it, don't you multiply the TTFB?

To be exact, the gateway server pushes the body of all responses as soon as possible, except the body of the explicitly requested one (because of what I described in my previous comment).

If your API is on the same local network and replies quickly (again, use the layering capabilities of HTTP, Varnish for example, and it will take only a few ms for the gateway server to fetch all the data over the local network), it shouldn't be an issue. Also, the gateway server downloads the responses from the Varnish server in parallel, so you don't multiply the TTFB.

The typical bottleneck is the network between the server and the client, not the local network. And Vulcain helps a lot for this (because of parallelism). You can check that on your development environment by using the throttling feature of your browser's dev tools.

That being said, of course, fetch only what the client needs for the first render. It will always be faster not to download what isn't needed, or what can be fetched later (after a click or something like that).

Thanks for the great in depth explanation! I implemented Varnish in the demo and put the application in production mode and it responds a lot faster. Going to try it in my project as well.

@dunglas maybe I misunderstood the spec, but wouldn't it be possible to Server-Push a response after its direct children have been Push-Promised, without waiting for the grand-children to be resolved?

eg, this kind of timeline :

Push-Promise /root/1/children/1
Push-Promise /root/1/children/2
Response /root/1

Push-Promise /root/1/children/1/grand-children/1
Push-Promise /root/1/children/1/grand-children/2
Server-Push /root/1/children/1

Push-Promise /root/1/children/2/grand-children/3
Push-Promise /root/1/children/2/grand-children/4
Server-Push /root/1/children/2

Server-Push /root/1/children/1/grand-children/1
Server-Push /root/1/children/1/grand-children/2
Server-Push /root/1/children/2/grand-children/3
Server-Push /root/1/children/2/grand-children/4