rethinkdb / rethinkdb

The open-source database for the realtime web.

Home Page: https://rethinkdb.com

Execute queries in parallel on a single connection

AtnNn opened this issue

Not to be confused with #2156, which is about evaluating parts of a single query simultaneously.

The JavaScript driver and many community drivers are asynchronous, but they don't get the full benefit of that asynchrony, because the server will only evaluate one query at a time on a single connection.

This also causes pathological behaviour when using changefeeds. Each changefeed adds a possible latency of 500ms to all other queries on the same connection. The only sane way to use changefeeds is to open a new connection for each changefeed. I believe this problem would be better solved by allowing parallel execution of queries on a single connection and not by adding server pools.
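
For concreteness, here is a sketch of that connection-per-changefeed workaround using the official JS driver's documented API (the table name and connection options are illustrative): the feed gets a dedicated connection, so its batching delay cannot add latency to other queries.

var r = require('rethinkdb');

r.connect({host: 'localhost', port: 28015}).then(function(feedConn) {
  // This connection serves only the feed; run all other queries elsewhere.
  return r.table('games').changes().run(feedConn);
}).then(function(cursor) {
  cursor.each(function(err, change) {
    if (err) throw err;
    console.log(change); // {old_val: ..., new_val: ...}
  });
});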

Another disadvantage of the current situation is that listening on an empty changefeed will cause an empty response to be sent every 500ms. This causes useless wake-ups and network traffic.

With this proposal, if a user waits for the result of a query before sending the next query to the server, queries would still be executed sequentially.
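
A sketch of the difference from the client's point of view (assuming a promise-returning run() and an illustrative users table, with r and conn as in the driver examples below): waiting for each result preserves sequential execution, while firing queries without waiting would let the server evaluate them concurrently.

function sequential(conn) {
  // The second query is not sent until the first response arrives.
  return r.table('users').insert({id: 1}).run(conn).then(function() {
    return r.table('users').get(1).run(conn);
  });
}

function concurrent(conn) {
  // Both queries are on the wire before either response returns; under
  // this proposal the server may evaluate them in parallel, in no fixed order.
  return Promise.all([
    r.table('users').insert({id: 2}).run(conn),
    r.table('users').count().run(conn)
  ]);
}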

When this is implemented, we should include a backwards-compatible mode. Some drivers rely on the fact that responses are in the same order as their respective requests.

To avoid complications in the web UI, it may be easier to not make this change for HTTP connections.

This proposal is based on my understanding of the current behaviour; please correct me if I am wrong.

@mlucy @danielmewes Any thoughts?

I don't believe that there exist drivers that rely on responses arriving in the same order as their requests. Which ones do?

@srh I was thinking about community drivers. Over a year ago, if I remember correctly, the Python driver's network code would discard responses until it got one with a matching token.

It's crazy if the Python driver discards responses instead of erroring; it knows the server is sending back garbage! Any driver with a synchronous API shouldn't be sending two queries without waiting for a response to the first, unless the first was a noreply query, or unless it was deliberately designed to do so (which is possible, but in that case it's unlikely they'd mistakenly rely on message ordering).

The Python driver does not have that behaviour anymore.

However the JavaScript driver does send new queries to the server without waiting for previous queries to respond. I believe other community drivers do too. It is the behaviour people are led to expect when there are unique tokens in the protocol.

I don't believe any of those are relying on responses arriving in the same order as requests.

The reason we've hesitated to do this is that in asynchronous languages like JS people sometimes write things like:

table.insert(...).run(conn, callback)
table.filter(...).run(conn, callback)

and expect the filter to see the write performed by the insert.

It might be worth just putting up with that problem, though. Tagging this as RQL_proposal, we should talk about it during the next discussion period.

Dropping the guarantee that queries on a given connection wait for all earlier ones to complete before running would also simplify the design of a connection pool API (#281). More discussion in the next RQL_proposal period, then.

@mlucy, for the JavaScript people out there, I think it is just a matter of encouraging them to leverage Promises or generators for pseudo-synchronous execution.
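
For example, the same pseudo-synchronous flow with promises, assuming run() returns a promise when no callback is given (as the official driver does; the table and filter are illustrative):

r.table('users').insert({name: 'a'}).run(conn)
  .then(function() {
    return r.table('users').filter({name: 'a'}).run(conn);
  })
  .then(function(cursor) { return cursor.toArray(); })
  .then(function(rows) { console.log(rows); })
  .catch(function(err) { console.error(err); });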

and expect the filter to see the write performed by the insert.

As they should. It would be a bug otherwise.

@srh, but written the way @mlucy wrote it, it would be common in Node-land to expect both queries to run with no guaranteed order

As a reference, a correct way to sequence queries in JavaScript would be:

table.insert(...).run(conn, function(error, result){
  if(error || result.first_error){ ... return; }
  table.filter(...).run(conn, callback);
});

exactly, or if you're into generators...

yield table.insert(...).run()
yield table.filter(...).run()

@srh, @thelinuxlich is right. In Node.js the programmer is responsible for making sure the callbacks fire correctly. I would love not to have to open a new connection for every asynchronous database call.

The reason behind the current behavior was the following: If you issue a write and then a read on the same connection, the read should see the write.

Some people felt strongly about that a long time ago, and we adopted the current behavior. I think it's fair to run the queries in an asynchronous fashion, especially since:

  • The guarantee is wonky: the read doesn't see your write if the write fails, and you have no guarantee that another write won't overwrite your first write before the read.
  • If you want to see your write, you can now use returnChanges (see the sketch after this list).
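
A sketch of that returnChanges alternative (returnChanges is the documented insert option; the table name is illustrative): the write itself reports the documents it produced, so no follow-up read, and hence no ordering guarantee, is needed.

r.table('users').insert({name: 'a'}, {returnChanges: true}).run(conn)
  .then(function(result) {
    // result.changes is an array of {old_val, new_val} pairs
    console.log(result.changes[0].new_val);
  });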

The current behavior is also confusing, I think. Users currently cannot build a safe connection pool (one where a query is guaranteed not to be issued on a connection that is already in use) without automatically coercing cursors and forbidding feeds. This is mostly because we run CONTINUE queries under the hood. All the work behind rethinkdbdash's pool went into working around this limitation.

Also, in my opinion it's expected that if you want a synchronous flow for asynchronous operations in Node.js, you must nest calls, use a library like async, or use generators.

@neumino After noticing that my controller methods were getting processed serially, I finally just gave up and started opening new connections for every....single....query.

It almost seems counter-intuitive, but performance shot WAAAAAY up.

Yeah, I'm pretty sure at this point that getting rid of that guarantee is the way to go.

We'll make a complete plan for how to proceed about this after shipping 1.16. As @neumino said, this is also relevant for the question of how to implement connection pools, and also matters for #3298.

Despite a certain potential to expose new bugs, I'm scheduling this for 2.0.

The reason is that having multiple changefeeds open on the same connection (as discussed in #3298 and #3678) isn't practicable without this change.

We should be conservative about when to enable this feature. I suggest two restrictions to this on the server side:

  • Only ever process one request at a time per request token. Make sure that per token, we send back responses in the same order in which the requests arrived. This avoids a whole bunch of potential race conditions (e.g. what if some cursor implementation sends two CONTINUE requests? What if it sends a CONTINUE and then a STOP before receiving the response to the CONTINUE?).
  • Increase the protocol version magic. Only enable concurrent query execution for drivers that send the new magic. That way we can be sure not to break any existing (third party) drivers that are not prepared to handle parallel query execution.

To be clear regarding "Make sure that per token, we send back responses in the same order in which the requests arrived.":
I think it's enough to just keep the lock on the token until we've completely sent the response. I would avoid doing anything more complex (like pipelining) for 2.0.
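
A minimal sketch of that per-token locking, in illustrative JavaScript rather than the actual server code (handleRequest is an invented stand-in for query evaluation): requests sharing a token are chained so only one runs at a time, while requests on different tokens proceed in parallel.

var tokenChains = new Map(); // token -> tail of that token's promise chain

function dispatch(token, handleRequest) {
  var tail = tokenChains.get(token) || Promise.resolve();
  // A request on a token starts only after the previous response on that
  // token has been fully sent; unrelated tokens run concurrently.
  var next = tail.then(function() { return handleRequest(); });
  tokenChains.set(token, next.catch(function() {})); // keep the chain alive on errors
  return next;
}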

I've complained in person that we shouldn't re-use tokens, like, at all; it just invites bugs. If we're bumping the protocol magic anyway we could put something else in like "when you send a CONTINUE request, you supply in addition the token you're going to query next" or alternatively the server sends back a different token, or something similar. The idea being that it should be impossible to send two CONTINUEs to the server before getting a response back from either one, and then trying to make sense of the resulting situation. My intent here is that the server will send back information from one and then go "we already did this" with the other. If the server sends back the token to use for the next request, it's actually impossible to submit two valid read requests on the wire before a response is gotten back from either one.
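
A hypothetical illustration of that handoff (every name here is invented, not the actual wire protocol): each response carries the only token valid for the next request on its stream, so two outstanding CONTINUEs are impossible by construction.

// sendRequest is a made-up helper that writes one request and resolves with its response.
function readNextBatch(conn, stream) {
  return sendRequest(conn, {type: 'CONTINUE', token: stream.nextToken})
    .then(function(response) {
      stream.nextToken = response.nextToken; // the server issues the next valid token
      return response.data; // the old token can never be replayed
    });
}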

That said, I do agree that we should worry about how much of it should be automatically parallelizable. Changefeeds seem like an obvious instance where the default should be YES, parallelize. Presumably if we made an option to .run that said "please run me in parallel" that would work too. Are there any other default-YES situations we should worry about?

I think we should default to parallelization for all queries (with some cap on the number of coroutines we spawn). For asynchronous drivers like JS it makes more sense, because it's entirely plausible people will just be firing off queries while other queries are queued up.

I think we should default to parallelization for all queries

👍
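
The "cap on the number of coroutines" mentioned above amounts to a counting semaphore; here is an illustrative client-side JavaScript stand-in (the real throttling would live in the server, and the limit is arbitrary):

function makeLimiter(limit) {
  var active = 0;
  var waiting = [];

  function release() {
    active--;
    var next = waiting.shift();
    if (next) next(); // wake one queued task, which re-checks the cap
  }

  function acquire() {
    if (active < limit) {
      active++;
      return Promise.resolve();
    }
    // Over the cap: wait for a slot, then re-check (another task may win it).
    return new Promise(function(resolve) { waiting.push(resolve); }).then(acquire);
  }

  return function runLimited(task) {
    return acquire().then(task).then(
      function(value) { release(); return value; },
      function(err) { release(); throw err; }
    );
  };
}

// e.g. var runQuery = makeLimiter(64); runQuery(function() { return q.run(conn); });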

There seems to be general consensus on this; planning to mark it as settled on Monday.

@mlucy, this is done except for #3754, isn't it?

If I open a feed and send a CONTINUE query, I won't get any response until the server sees a change (with the new protocol version, v0_4).
So if no change happens and I want to close the feed, I have to send the STOP query.

My question is:

  • What happens in this case? I seem to be stuck (no response is returned) for the CONTINUE or STOP query.
  • Is the CONTINUE query supposed to return nothing? An error?

One more thing: if I force the STOP query and trigger a change after that, the CONTINUE query returns an empty SUCCESS_SEQUENCE, but the STOP query will throw with something like "Token X not in cache".

@neumino -- that sounds like a bug to me; the STOP query should be interrupting the CONTINUE query. Good catch!