jackc / pgx

PostgreSQL driver and toolkit for Go

Does pgconn.Exec() really handle transaction control statements as the code comment states?

jpargudo opened this issue

The comment on Exec() states:

// Exec executes SQL via the PostgreSQL simple query protocol. SQL may contain multiple queries. Execution is
// implicitly wrapped in a transaction unless a transaction is already in progress or SQL contains transaction control
// statements

I tested this by executing an SQL script like this one, 5 times:

begin;
insert into data (a) values ((round(random()*100+1))::integer); 
commit;
select pg_sleep(1);
begin;
insert into data (a) values (999);
commit;

I expected Exec() to run the first transaction, wait 1 second, and then run the second transaction.

But looking at the database, everything appears to have been wrapped in a single transaction. With this table definition:

db1=> \d data
                                         Table "testuser.data"
┌─────────────┬─────────────────────────────┬─────────────────┬───────────┬──────────────────────────────┐
│   Column    │            Type             │    Collation    │ Nullable  │           Default            │
├─────────────┼─────────────────────────────┼─────────────────┼───────────┼──────────────────────────────┤
│ id          │ bigint                      │                 │ not null  │ generated always as identity │
│ insert_date │ timestamp without time zone │                 │           │ CURRENT_TIMESTAMP            │
│ a           │ integer                     │                 │           │                              │
└─────────────┴─────────────────────────────┴─────────────────┴───────────┴──────────────────────────────┘
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)

I got this result:

db1=> select * from data;
┌─────────┬────────────────────────────┬─────┐
│   id    │        insert_date         │  a  │
├─────────┼────────────────────────────┼─────┤
│ 2514254 │ 2024-01-12 11:24:53.307359 │  40 │
│ 2514255 │ 2024-01-12 11:24:53.307359 │ 999 │
│ 2514256 │ 2024-01-12 11:24:54.416254 │  20 │
│ 2514257 │ 2024-01-12 11:24:54.416254 │ 999 │
│ 2514258 │ 2024-01-12 11:24:55.626889 │  49 │
│ 2514259 │ 2024-01-12 11:24:55.626889 │ 999 │
│ 2514260 │ 2024-01-12 11:24:56.838941 │  35 │
│ 2514261 │ 2024-01-12 11:24:56.838941 │ 999 │
│ 2514262 │ 2024-01-12 11:24:58.05045  │  98 │
│ 2514263 │ 2024-01-12 11:24:58.05045  │ 999 │
└─────────┴────────────────────────────┴─────┘
(10 rows)

I expected the insert_date values within each pair of rows to differ by at least 1 second. Instead, both rows of each pair have the same value, so I assume the two inserts were executed in a single transaction, whereas I expected Exec() to do what the comment says.

Or maybe I misunderstood something here, or misused something?

Thanks for any help on this!

It does do what it says, but the simple protocol can be weird. See https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT for some of the details. Even though transaction control statements are allowed, the individual statements are still tied together to some degree. For example, an error stops execution and you can't roll back. And while it's not directly documented, it appears that CURRENT_TIMESTAMP gets its "beginning of transaction" time from the beginning of the implicit transaction started by the simple protocol execution, not from the explicit transaction created inside it.

As an aside, this behavior is difficult to observe or test with psql: even though psql uses the simple protocol, it parses SQL strings before sending and splits each query text into its own query protocol message.
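To make the difference concrete, here is a minimal Go sketch using only the standard library. It frames SQL text as a simple-protocol Query message (the byte 'Q', a big-endian int32 length that counts itself, then the NUL-terminated text) and compares sending the whole script in one message with sending one message per statement. The helper names queryMessage and splitNaive are hypothetical and are not part of pgx or psql. The splitter is a deliberate simplification: it cuts on every semicolon, whereas psql's real lexer also understands string literals, comments, and dollar quoting. The sketch assumes that a string passed to Exec() travels in a single Query message.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// script mirrors the SQL from the report above.
const script = `begin;
insert into data (a) values ((round(random()*100+1))::integer);
commit;
select pg_sleep(1);
begin;
insert into data (a) values (999);
commit;`

// queryMessage frames sql as a simple-protocol Query message:
// 'Q', a big-endian int32 length (counting itself but not the 'Q'),
// then the SQL text followed by a NUL terminator.
func queryMessage(sql string) []byte {
	msg := make([]byte, 5, 5+len(sql)+1)
	msg[0] = 'Q'
	binary.BigEndian.PutUint32(msg[1:5], uint32(4+len(sql)+1))
	msg = append(msg, sql...)
	return append(msg, 0)
}

// splitNaive is a hypothetical stand-in for client-side splitting.
// It cuts on every ';', so it mishandles semicolons inside string
// literals, comments, or dollar-quoted bodies.
func splitNaive(script string) []string {
	var stmts []string
	for _, part := range strings.Split(script, ";") {
		if s := strings.TrimSpace(part); s != "" {
			stmts = append(stmts, s+";")
		}
	}
	return stmts
}

func main() {
	// Whole script in one message: the server sees all statements at once.
	whole := queryMessage(script)
	fmt.Printf("single message: %d bytes\n", len(whole))

	// One message per statement: each statement arrives on its own.
	for i, stmt := range splitNaive(script) {
		fmt.Printf("message %d: %d bytes: %s\n", i+1, len(queryMessage(stmt)), stmt)
	}
}
```

With the script from the report, the first form produces one message containing all statements, while the second produces a separate message for each statement. That would be consistent with psql showing distinct timestamps for the two inserts even when a single Exec() call does not.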

Thank you very much @jackc for your explanations and, obviously, for all the work done. It is very helpful.
Best,
Jean-Paul