amphp / postgres

Async Postgres client for PHP based on Amp.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

rfc: better transaction disposing mechanism

azjezz opened this issue · comments

while working on #50, i noticed a weird behavior within the library.

a query cannot be executed ( on the same connection ), unless all handles related to a specific transaction are released, however, the library doesn't offer a way to force release all handles.

Let's take the following example:

/**
 * @template T
 * @param Closure(Executor): T $operation
 * @return T
 */
function transactional(Link $link, Closure $operation): mixed {
  $trans = $link->beginTransaction();
  try {
    $result = $operation($trans);
    $trans->commit();
    return $result;
  } catch(Throwable $e) {
     $trans->rollback();
     throw $e;
  }
}

$link = ...;
$users = transactional($link, static fn($executor) => $executor->query('SELECT * FROM users'));

so far so good, we have a function that given a database link, and an operation, will execute the operation within a transaction, and return it's results.

however, let's take this a step further, simulating a more real-world usage of transactional function.

$users = transactional(
  $link,
  static fn($executor) => $executor->execute('SELECT * FROM forum_users WHERE role = ?', [User::ROLE_ADMIN]),
);

foreach($users as $user) {
  if (admin_has_been_active($user)) {
    continue;
  }

  if (!admin_warned_about_demotion($user)) {
    warn_admin_about_demotion($user);
    continue;
  }

  if (!admin_warned_about_demotion_over_a_week_ago($user)) {
    // hasn't been a week
    continue;
  }
  
  $link->execute('UPDATE users SET forum_users = ? WHERE id = ?', [User::ROLE_SUPER, $user['id']]);
}

here, we get the list of our forum admins, iterator over them, and if one of them admin has not been active, we warn them that their account is going demoted next week, if the have been warned over a week ago, we demote their account to super users.

However, this currently fails.

Since $users is a result set, this result set still has a reference to a handle that is associated with the transaction used to retrieve the users, while arguably the initial query shouldn't have been made within a transaction, remember that this is just an example.

What i suggest is the following:

if a transaction has been rolled-back, or committed, all handles should be force released, this can be done in one of the following ways:

  1. force release and mark objects with a reference to any handle as invalid ( e.g iterating over users will fail ).
  2. do all the work immediately ( e.g: $users needs a reference to the handle in order to fetch users while iterating, however, when a commit() or a rollback() is executed, we can force the result to fetch all users immediately, making it drop the reference to the handle, and still be valid.

I personally prefer the second solution, while it might be bad for memory when dealing with large queries, it is the most sane behavior in my opinion, as if the user is holding a reference to that object, they are probably still going to use it, if not, they can manually call dispose() on the result.

not sure if amphp/mysql suffers from this issue as well, as i have been mostly playing around with amphp/postgres lately, though, i would assume it does.

IMO, the example should be processing the user list and updating it within the callback given to transactional, otherwise why was the SELECT query performed in a transaction at all? This makes me favor option 1 as that would prevent such misuse.

while arguably the initial query shouldn't have been made within a transaction, remember that this is just an example.

This is just an example, $users could have been returned from UserManager::fetchAllAdmins() method, which calls another service, calling another service, and somewhere deep down, it's nested within a transaction.

A service would not commit a transaction. For the service query to be executed within the context of a transaction, the Transaction object would need to be used for the query instead of the pool. Once a transaction is started on a connection, all activity on the connection should go through the Transaction object until it is committed/rolled back. The result set preventing another query actually has little to do with being in a transaction – you wouldn't be able to execute the update query on the same connection while consuming the result set.

Using a Transaction object for a service is certainly problematic as it will need to be injected as part of the request. Maybe there's something better we could do, I'd have to give that some thought.

Using a Transaction object for a service is certainly problematic

That doesn't have to be the case, e.g: using Neu\Database\DatabaseInterface ( still early in dev, so the API is not final ).

the service could look like this:

use Neu\Database\DatabaseInterface;

final class SomeService
{
  public function __construct(
    private readonly DatabaseInterface $database,
  ) {}

  /**
   * @return iterable<User>
   */
  public function fetchAllAdmins(): iterable
  {
    return $this->database->transactional(function($transaction): iterable {
      foreach($transaction->fetchAll('users', ['role' => User::ROLE_ADMIN]) as $data) {
        if ($this->isStillUsingRawPassword($data)) {
          $this->hashPassword($transaction, $data);
        }

        yield User::fromArray($data);
      }
    });
  }
}

and this is a proof that solution 2 doesn't work :), i agree on going with 1, with a clear error message ( on where the commit()/rollback() happened that resulted in the dispose() of the result set )

not sure if amphp/mysql suffers from this issue as well, as i have been mostly playing around with amphp/postgres lately, though, i would assume it does.