kellabyte / frenzy

Postgres wire protocol aware mirroring proxy

PSQL listing tables has different result when proxied through frenzy.

kellabyte opened this issue

Primary: localhost:5441 PostgreSQL 15.2
Mirror: localhost:5442 PostgreSQL 15.2
PSQL Version: psql (PostgreSQL) 13.3

CLI:

frenzy --listen :5432 \
    --primary postgresql://postgres:password@localhost:5441/postgres \
    --mirror postgresql://postgres:password@localhost:5442/postgres

Summary

When you run the \l (list databases) command in psql, the result returned through Frenzy differs slightly from the result of running the same command directly against the primary.

Results direct from primary

PGPASSWORD=password psql -E -U postgres -h localhost -p 5441 -d test -c "\l"

********* QUERY **********
SELECT d.datname as "Name",
       pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
       pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
       d.datcollate as "Collate",
       d.datctype as "Ctype",
       pg_catalog.array_to_string(d.datacl, E'\n') AS "Access privileges"
FROM pg_catalog.pg_database d
ORDER BY 1;
**************************

                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-----------+----------+----------+------------+------------+-----------------------
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 test      | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
(4 rows)

Results from frenzy

PGPASSWORD=password psql -E -U postgres -h localhost -p 5432 -d test -c "\l"

********* QUERY **********
SELECT d.datname as "Name",
       pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
       pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
       pg_catalog.array_to_string(d.datacl, '\n') AS "Access privileges"
FROM pg_catalog.pg_database d
ORDER BY 1;
**************************

                          List of databases
   Name    |  Owner   | Encoding |         Access privileges
-----------+----------+----------+------------------------------------
 postgres  | postgres | UTF8     |
 template0 | postgres | UTF8     | =c/postgres\npostgres=CTc/postgres
 template1 | postgres | UTF8     | =c/postgres\npostgres=CTc/postgres
 test      | postgres | UTF8     |
(4 rows)

Anomalies

It looks like the issued SQL query from psql differs on each connection!

Collate column missing.
Ctype column missing.
Access privileges column doesn't match.

Are you using a current version of psql and the same version of Postgres on both the primary and mirror servers?

I tested this locally against a primary and replica Postgres server and received consistent results between a direct connection and frenzy. I am using the latest psql and Postgres versions though (15.x). My results vary from yours in that they also include ICU Locale and Locale Provider columns, but these columns appear to be generated by psql itself.

Another thought - have you tried using tcpdump or something similar to analyze the PG wire protocol?

Assuming mac: sudo tcpdump -A -i any port 5432

During the connection handshake the server sends the client a bunch of information regarding the server version, encoding, etc. I'm wondering if the library you are using to accept incoming traffic is returning an old server_version or protocol version, causing psql to omit some of the fields because support for them was added in later Postgres versions.
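
To make the handshake point concrete: in version 3 of the Postgres wire protocol, settings such as server_version reach the client as ParameterStatus messages (type byte 'S', a 4-byte big-endian length that counts itself, then a null-terminated name and value). The sketch below is a rough illustration of that framing, not code from frenzy or psql-wire; the function names are made up for the example, and it assumes the whole byte stream is already in memory.

```python
import struct


def encode_parameter_status(name, value):
    """Frame one ParameterStatus message (hypothetical helper for this example)."""
    body = name.encode() + b"\x00" + value.encode() + b"\x00"
    # The length field counts its own 4 bytes plus the body, but not the type byte.
    return b"S" + struct.pack("!I", len(body) + 4) + body


def parse_parameter_statuses(stream):
    """Collect name/value pairs from every ParameterStatus message in a byte stream."""
    params = {}
    offset = 0
    while offset + 5 <= len(stream):
        msg_type = stream[offset:offset + 1]
        (length,) = struct.unpack_from("!I", stream, offset + 1)
        body = stream[offset + 5:offset + 1 + length]
        if msg_type == b"S":
            # Body layout: name NUL value NUL, so splitting leaves a trailing empty item.
            name, value, _ = body.split(b"\x00")
            params[name.decode()] = value.decode()
        offset += 1 + length  # skip the type byte plus the counted length
    return params
```

If a proxy never sends a server_version entry, or sends an old one, the client has nothing newer to go on, which would fit the behaviour described in this issue.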

This appears to be the psql client code which defines what query is sent for the "List DB's" command: https://github.com/postgres/postgres/blob/30a53b792959b36f07200dae246067b3adbcc0b9/src/bin/psql/describe.c#L917-L1006

You can see that there are variances in the query based on the server version. It also looks like this code was just changed in the last few weeks and may explain why I see different results than you do using the latest version.

@redwolf3 Thanks so much for the information and investigation! I need to dig into this but I did find this area of the Postgres wire protocol library I'm using. I wonder if this is related?

jeroenrinzema/psql-wire/handshake.go

@kellabyte I think that is likely part of the cause. The other part is likely the version of the psql library you are using.

If you look at the v14.7 release of Postgres / psql, it appears that it only requests the Collate and Ctype fields if the server version is defined and greater than 8.x:
https://github.com/postgres/postgres/blob/REL_14_7/src/bin/psql/describe.c#L1063-L1068

In the v15.2 release of Postgres / psql (that I am using), it appears it always requests the Collate and Ctype fields, regardless of the server version:
https://github.com/postgres/postgres/blob/REL_15_2/src/bin/psql/describe.c#L934-L935

I'm guessing you are using an older psql client / library version, and that this combined with the missing server_version in the mirror library parameters is causing the discrepancy you are seeing.
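
A simplified model of that version gating, written in Python rather than psql's C, may help show how one missing server version can explain all three anomalies. This is only a sketch: the function name is hypothetical, and the thresholds (80400 for the collation columns, 80100 for the E'' escape-string syntax) are my reading of the older describe.c and should be checked against the linked source. Without the E prefix, and with standard_conforming_strings on, '\n' is a literal backslash followed by n, which would match the unsplit "Access privileges" column in the proxied output.

```python
def list_databases_query(sversion):
    """Build a \\l-style query the way an older psql might, gated on server version."""
    columns = [
        'd.datname as "Name"',
        'pg_catalog.pg_get_userbyid(d.datdba) as "Owner"',
        'pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding"',
    ]
    # Assumed threshold: collation columns are only requested for 8.4 and later.
    if sversion >= 80400:
        columns.append('d.datcollate as "Collate"')
        columns.append('d.datctype as "Ctype"')
    # Assumed threshold: E'' escape strings are only used for 8.1 and later.
    newline = "E'\\n'" if sversion >= 80100 else "'\\n'"
    columns.append(
        f'pg_catalog.array_to_string(d.datacl, {newline}) AS "Access privileges"'
    )
    return (
        "SELECT " + ",\n       ".join(columns)
        + "\nFROM pg_catalog.pg_database d\nORDER BY 1;"
    )
```

Calling list_databases_query(0) yields the three-column form seen through frenzy, while list_databases_query(150002) yields the form seen directly against the primary.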

(fixed reference link for v14.7 in previous comment)

I updated the issue details to include the Postgres versions for the primary, mirrors and PSQL. You were right, I'm using an older psql (PostgreSQL) 13.3.

I experimented with setting the wire.Server.Version = 150002 which I think is the correct value for how Postgres does versioning.

PQserverVersion
Returns an integer representing the server version.

int PQserverVersion(const PGconn *conn);
Applications might use this function to determine the version of the database server they are connected to. The result is formed by multiplying the server's major version number by 10000 and adding the minor version number. For example, version 10.1 will be returned as 100001, and version 11.0 will be returned as 110000. Zero is returned if the connection is bad.

Prior to major version 10, PostgreSQL used three-part version numbers in which the first two parts together represented the major version. For those versions, PQserverVersion uses two digits for each part; for example version 9.1.5 will be returned as 90105, and version 9.2.0 will be returned as 90200.

Therefore, for purposes of determining feature compatibility, applications should divide the result of PQserverVersion by 100 not 10000 to determine a logical major version number. In all release series, only the last two digits differ between minor releases (bug-fix releases).
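
As a worked example of the numbering rules quoted above, here is a small Python sketch that turns a version string into that integer form. The function names are hypothetical, and it assumes the string starts with dotted numeric parts (anything after them, such as a distribution suffix, is ignored).

```python
import re


def server_version_num(version):
    """Convert a version string such as "15.2" into the PQserverVersion integer form."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, second, third = (int(p) if p is not None else 0 for p in match.groups())
    if major >= 10:
        # Two-part numbering: major * 10000 + minor.
        return major * 10000 + second
    # Three-part numbering before version 10: two digits for each part.
    return major * 10000 + second * 100 + third


def logical_major(version_num):
    """Drop the last two digits, which only differ between minor releases."""
    return version_num // 100
```

With these rules "15.2" maps to 150002, which is the value I set above, and "9.1.5" maps to 90105.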

And I got the following correct output from psql (PostgreSQL) 13.3!

PGPASSWORD=password psql -E -U postgres -h localhost -p 5432 -d test -c "\l"
********* QUERY **********
SELECT d.datname as "Name",
       pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
       pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
       d.datcollate as "Collate",
       d.datctype as "Ctype",
       pg_catalog.array_to_string(d.datacl, E'\n') AS "Access privileges"
FROM pg_catalog.pg_database d
ORDER BY 1;
**************************

                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-----------+----------+----------+------------+------------+-----------------------
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 test      | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
(4 rows)

I think the correct fix here is a smart version detector: on startup, connect to the primary before opening the Postgres wire protocol listener, detect the primary's version, and have the proxy announce that version. Thoughts?

Thanks so much for this investigation and the time you spent helping. It's teaching me a lot about what I need to record in bug investigations.

@kellabyte I could see arguments for a couple of options:

  1. Interrogate primary and use that as your protocol version (what you suggested)
  2. Interrogate both primary and mirror(s) and select the lowest version

Option 1 seems like the easiest and most logical option. It’s likely that any versions you are testing as mirrors are going to be the same as or newer than the primary version (patched, minor upgrade, major upgrade, etc.).

Option 2 seems like it would get you the “best” compatibility, as your client will drop back to the lowest protocol version (which would typically be the most compatible, but may disable some features).

Given the use case, I would suggest starting with option 1 and then making it configurable once you encounter a case where that doesn’t work.

(Assuming you were actually asking me, hahaha)

I was asking your opinion! I think adding a configuration option for this would be great, where you can set a hard-coded value or choose primary or lowest. For now I added autodetection from the primary. I will create a feature request issue to enable the other modes, but for now this is good enough.
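
The three modes discussed here could be sketched roughly like this. It is an illustration only, not what frenzy implements: the mode names and the function are hypothetical, and versions are assumed to be integers in the PQserverVersion form (for example 150002), which compare correctly across releases.

```python
def announced_version(mode, primary, mirrors=(), fixed=None):
    """Pick the integer server version the proxy should announce.

    mode is one of "primary", "lowest", or "fixed" (hypothetical names).
    """
    if mode == "primary":
        # Option 1: mirror whatever the primary reports.
        return primary
    if mode == "lowest":
        # Option 2: the oldest server among the primary and all mirrors.
        return min([primary, *mirrors])
    if mode == "fixed":
        if fixed is None:
            raise ValueError("mode 'fixed' requires a version")
        return fixed
    raise ValueError(f"unknown mode: {mode!r}")
```

For a 15.2 primary with a 14.7 mirror, "primary" would announce 150002 and "lowest" would announce 140007.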

Thank you sooo much for spending the time to investigate and offer some ideas on how this version can be configured in the future.

Fixed in commit 3fd74df.