r2dbc / r2dbc-spi

Service Provider Interface for R2DBC Implementations

Refactor TestKit from using own resource management to reuse `Flux/Mono.usingWhen(…)`

Michael-A-McMahon opened this issue

The TCK's TestKit.returnGeneratedValues() method is implemented to close the Connection before a Result.map(BiFunction) Publisher has terminated:

.concatWith(close(connection)).flatMap(it -> it.map((row, rowMetadata) -> row.get(0)));

This requires the driver to cache generated values so that Row objects may be consumed after the connection is closed.

The closing of the connection in this TCK method is curious, as it seems to conflict with behavior that is specified in the javadoc of Connection:

Objects created by a connection are only valid as long as the connection remains open.

In Oracle's R2DBC implementation, we handle this by draining row data into a List that can be published after the connection is closed:
https://github.com/oracle/oracle-r2dbc/blob/527b7e181d082d9f52b4e5334d941f577c9291ef/src/main/java/oracle/r2dbc/impl/OracleResultImpl.java#L211
This implementation seems very anti-reactive (pre-active?). I think a more reactive implementation would defer fetching rows from the database until a subscriber had signaled demand.
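For illustration, a minimal sketch of that caching shape (hypothetical, built from plain Reactor operators rather than Oracle's actual internals, assuming result is an emitted io.r2dbc.spi.Result):

  // Hypothetical sketch: every Row is consumed and mapped up front, so the
  // resulting List can be replayed after the connection is closed.
  Flux<Object> cachedGeneratedValues =
    Flux.from(result.map((row, rowMetadata) -> row.get(0)))
      .collectList()                      // eagerly drains all rows
      .flatMapIterable(values -> values); // re-publishes the cached values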

Is the closing of the Connection in returnGeneratedValues() simply a mistake? If so, then I think it should be corrected. (I can probably implement the change and create a pull request, if that's desired)

Or, is it the intent of the TCK to verify that generated values can be consumed after a Connection is closed? If so, then I think that should be defined in the specification of R2DBC.

No, results and rows are only valid while the connection is open. The TCK makes use of concatWith, which subscribes to a Publisher only after the previous Publisher has completed.
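As a minimal illustration of that ordering (plain Reactor, nothing R2DBC-specific):

  // The second Publisher is subscribed to only once the first completes.
  Flux.just("row")
    .concatWith(Mono.<String>fromRunnable(() -> System.out.println("close")))
    .subscribe(System.out::println); // prints "row", then "close"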

I think a more reactive implementation would defer fetching rows from the database until a subscriber had signaled demand.

Sounds good.

Is the closing of the Connection in returnGeneratedValues() simply a mistake? If so, then I think it should be corrected. (I can probably implement the change and create a pull request, if that's desired)

As mentioned above, the close Publisher should be subscribed to after the result has completed its emission. If you experience other behavior, something needs to be fixed, likely in the TCK. It would probably make sense to generally refactor resource management to the usingWhen methods, as these ensure resource disposal upon failure and cancellation.
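For reference, a hedged sketch of Reactor's usingWhen overload that distinguishes completion, error, and cancellation (connectionFactory and the statement text are placeholders):

  // Sketch only: the connection is released on every terminal outcome.
  Flux.usingWhen(
    connectionFactory.create(),                 // acquire
    connection -> Flux.from(connection
      .createStatement("SELECT 1")
      .execute()),                              // use
    Connection::close,                          // release on complete
    (connection, error) -> connection.close(),  // release on error
    Connection::close);                         // release on cancel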

It is true that the close Publisher is subscribed to after the Statement.execute() Publisher emits a Result, and so we can guarantee that the Result is emitted downstream before the connection is closed. But I don't think this can guarantee that the Rows of the Result have been mapped before the connection is closed.

The ordering between mapping Rows and closing the connection will depend on which thread invokes the Row mapping function. If the Row is mapped on the same thread that invokes onNext(Result), then the following sequence is executed on a single thread and we get the correct order:

  1. onNext(Result)
  2. Result.map(BiFunction)
  3. BiFunction.apply(Row, RowMetadata) outputs T
  4. onNext(T)
  5. ... Returning up the stack ...
  6. Subscribe to close()

With Oracle's implementation of the row publisher, the row mapping function is executed on a different thread, so when onNext(Result) has returned there is no guarantee that the Row has been emitted and mapped.

Unless there is a requirement about which thread executes the mapping function, I think the concatWith(close(connection)) step should be sequenced after the map(BiFunction) Publisher terminates, not after the execute() Publisher terminates.
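In other words, something along these lines (a sketch reusing the TCK's close(connection) helper):

  // close(connection) is sequenced after the row-mapping Publisher,
  // not after the execute() Publisher:
  Flux.from(statement.execute())
    .flatMap(it -> it.map((row, rowMetadata) -> row.get(0)))
    .concatWith(close(connection));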

Threading should not impact the ordering of the sequence. Rather, we rely on signal ordering, which isn't tied to thread switches.

But I don't think this can guarantee that the Rows of the Result have been mapped before the connection is closed.

Row and Result aren't something to be carried around really. The intention is to consume both within the stream that returns the result. Row is even specified as:

A row is invalidated after consumption in the {@link Result#map(BiFunction) mapping function}.

I also think that we should refactor the TCK to improve at least readability and robustness. By switching to Flux/Mono.usingWhen for connection management, the other pieces will fall into place as well.

I changed the title to reflect what this is about.

Thanks for looking into this, mp911de.

I think usingWhen operators are great for the R2DBC use case where we need to close Connection objects. For JDBC, the try-with-resources statement made it a lot harder to leak java.sql.Connections, and I think usingWhen can do the same for R2DBC.
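To make the analogy concrete (dataSource here is a placeholder javax.sql.DataSource):

  // JDBC: the connection cannot leak past this block, even on failure.
  try (java.sql.Connection jdbc = dataSource.getConnection()) {
    // ... use jdbc ...
  }

usingWhen gives a reactive pipeline the same guarantee.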

So if we refactor the returnGeneratedValues() method, we can write something like this:

  @Test
  default void returnGeneratedValues() {

    // Set up the schema via JDBC before exercising the R2DBC driver.
    getJdbcOperations().execute(expand(TestStatement.DROP_TABLE));
    getJdbcOperations().execute(getCreateTableWithAutogeneratedKey());

    Flux.usingWhen(
      getConnectionFactory().create(),
      connection -> {
        Statement statement = connection.createStatement(getInsertIntoWithAutogeneratedKey());

        statement.returnGeneratedValues();

        // Map each generated Row while the connection is still open.
        return Flux.from(statement
          .execute())
          .flatMap(it -> it.map((row, rowMetadata) -> row.get(0)));
      },
      Connection::close) // subscribed only after the scope's Publisher terminates
      .as(StepVerifier::create)
      .expectNextCount(1)
      .verifyComplete();
  }

We get a nice scope in the connection -> { ... } lambda, which I think helps programmers reason about the happens-before and happens-after relationships between creating and closing the Connection. The programmer can focus on what they need to do with the Connection inside this scope, without worrying much about managing the Connection as a resource. Once the Publisher returned from this scope emits a terminal signal, the work performed with the connection has completed, and so the close() Publisher can be created and subscribed to.

Would this make sense?

Sounds about right.

Do you have any update on whether you plan to submit a pull request?

Yes, I can take this one. I can get started today. I'll plan to refactor all tests to use usingWhen; does that sound good?

Sure. I went through the open issues today and wanted to make sure we don't miss anything.