zalando / patroni

A template for PostgreSQL High Availability with Etcd, Consul, ZooKeeper, or Kubernetes

Patroni node is Leader while PostgreSQL is still in read-only mode (not promoted yet)

fr-br opened this issue · comments

What happened?

I have a 3-node setup with a sync standby node. Performing 'patronictl switchover' to the sync standby node worked fine with Patroni 2.1.1, Consul 1.10.12, and PostgreSQL 11.19.
I upgraded to Patroni 3.2.0, Consul 1.17.0, and PostgreSQL 15.5, and now 'patronictl switchover' shows the following behaviour:

  • The sync standby node gets the leader lock and hence becomes leader
  • A message is logged: 'Postponing promotion because synchronous replication state was updated by somebody else'
  • For the 'loop_wait' time (10 s in our case), the leader node responds OK to the GET /leader call, but PostgreSQL is not promoted yet
  • During this 'loop_wait' time, any write call towards the database fails with an exception stating that updates/inserts cannot be executed in a read-only transaction (obviously, as PostgreSQL is not promoted yet).

How can we reproduce it (as minimally and precisely as possible)?

  • Set up a 3-node cluster with a sync standby node (Patroni 3.2.0, Consul 1.17.0, PostgreSQL 15.5)
  • Perform 'patronictl switchover' from the current leader node (the problem does not happen every time, so you may need to do it several times)
  • Check the Patroni logs for 'Postponing promotion because synchronous replication state was updated by somebody else'
  • If possible, have software performing updates to the DB as soon as the sync standby node becomes leader.
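For anyone scripting the reproduction above, the race can be detected by grepping the Patroni logs for the postponement message. A minimal sketch; the helper is real Python, while the driver loop is hypothetical and left commented out (the `patronictl` invocation and log collection depend on your environment):

```python
POSTPONE = ("Postponing promotion because synchronous replication "
            "state was updated by somebody else")

def saw_postponement(log_lines):
    """Return True if any Patroni log line contains the postponement message."""
    return any(POSTPONE in line for line in log_lines)

# Hypothetical driver (not run here): repeat the switchover until the race shows up.
# import subprocess
# for attempt in range(20):
#     subprocess.run(["patronictl", "-c", "/etc/patroni.yml", "switchover",
#                     "--candidate", "node2", "--force"], check=True)
#     lines = collect_patroni_log_lines()   # e.g. from journalctl on the candidate
#     if saw_postponement(lines):
#         break
```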

What did you expect to happen?

First: Why do I see this message 'Postponing promotion because synchronous replication state was updated by somebody else'? As far as I know, the only software updating the synchronous replication state is Patroni.
Second: I would expect that when a node gets the leader lock, the PostgreSQL instance is always accepting write calls.

Patroni/PostgreSQL/DCS version

  • Patroni version: 3.2.0
  • PostgreSQL version: 15.5
  • DCS (and its version): Consul 1.17.0

Patroni configuration file

scope: fbc
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 192.168.1.11:8008

consul:
  host: 127.0.0.1:8500
    
log:
  level: INFO

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    synchronous_mode: true
    synchronous_mode_strict: false
    postgresql:
      use_pg_rewind: false
      use_slots: false
      remove_data_directory_on_diverged_timelines: true
      parameters:
        wal_level: "replica"
        hot_standby: "on"
        wal_keep_segments: 64
        max_wal_senders: 10
        max_replication_slots: 10
        wal_log_hints: "on"
        max_connections: 200
        synchronous_commit: "on"
        synchronous_standby_names: "*"

  initdb:
    - encoding: UTF8
    - data-checksums
    - auth: md5

  pg_hba:
    - host replication replicator 192.168.1.11/32 md5
    - host replication replicator 192.168.1.12/32 md5
    - host replication replicator 192.168.1.13/32 md5
    - host all rewind_user 192.168.1.11/32 md5
    - host all rewind_user 192.168.1.12/32 md5
    - host all rewind_user 192.168.1.13/32 md5

  users:
    fbc:
      password: ********
      options:
        - createrole
        - login
        - superuser

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.1.11:5432
  data_dir: /var/lib/pgsql/15/data
  bin_dir: /usr/pgsql-15/bin
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ********
    superuser:
      username: postgres
      password: ********
    rewind:
      username: rewind_user
      password: ********
  parameters:
    autovacuum_vacuum_scale_factor: 0.02
    autovacuum_vacuum_cost_limit: 1000
    autovacuum_vacuum_cost_delay: "10ms"
    shared_buffers: "1536MB"
    checkpoint_completion_target: 0.9
    from_collapse_limit: 2
    join_collapse_limit: 1

  basebackup:
    - verbose
    - max-rate: '100M'
    - checkpoint: 'fast'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

patronictl show-config

loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    hot_standby: 'on'
    max_connections: 200
    max_replication_slots: 10
    max_wal_senders: 10
    synchronous_commit: 'on'
    synchronous_standby_names: '*'
    wal_keep_segments: 64
    wal_level: replica
    wal_log_hints: 'on'
  remove_data_directory_on_diverged_timelines: true
  use_pg_rewind: false
  use_slots: false
retry_timeout: 10
synchronous_mode: true
synchronous_mode_strict: false
ttl: 30

Patroni log files

patroni: 2023-12-05 09:55:45,080 INFO: no action. I am (node2), a secondary, and following a leader (node1)
patroni: 2023-12-05 09:55:53,475 INFO: Cleaning up failover key after acquiring leader lock...
patroni: 2023-12-05 09:55:53,491 INFO: Postponing promotion because synchronous replication state was updated by somebody else
patroni: 2023-12-05 09:56:03,470 INFO: Lock owner: node2; I am node2
patroni: server promoting
patroni: 2023-12-05 09:56:03,486 INFO: promoted self to leader because I had the session lock
patroni: 2023-12-05 09:56:04,508 INFO: Lock owner: node2; I am node2
patroni: 2023-12-05 09:56:04,521 INFO: Assigning synchronous standby status to ['node3']
patroni: server signaled

PostgreSQL log files

2023-12-05 09:55:52.465 CET [9178] LOG:  replication terminated by primary server
2023-12-05 09:55:52.465 CET [9178] DETAIL:  End of WAL reached on timeline 15 at 0/DE08820.
2023-12-05 09:55:52.466 CET [9178] FATAL:  could not send end-of-streaming message to primary: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
        no COPY in progress
2023-12-05 09:55:52.466 CET [11245] LOG:  invalid record length at 0/DE08820: wanted 24, got 0
2023-12-05 09:55:52.478 CET [16975] FATAL:  could not connect to the primary server: connection to server at "192.168.1.11", port 5432 failed: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
2023-12-05 09:55:52.478 CET [11245] LOG:  waiting for WAL to become available at 0/DE08838
2023-12-05 09:55:56.043 CET [17065] ERROR:  cannot execute UPDATE in a read-only transaction
2023-12-05 09:55:56.043 CET [17065] STATEMENT:  UPDATE public.databasechangeloglock SET LOCKED = TRUE, LOCKEDBY = '2001:420:4511:2016:250:56ff:fea3:230d%ens160 (2001:420:4511:2016:250:56ff:fea3:230d%ens160)', LOCKGRANTED = NOW() WHERE ID = 1 AND LOCKED = FALSE
2023-12-05 09:55:57.477 CET [17073] LOG:  started streaming WAL from primary at 0/D000000 on timeline 15
2023-12-05 09:56:03.490 CET [11245] LOG:  received promote request
2023-12-05 09:56:03.490 CET [17073] FATAL:  terminating walreceiver process due to administrator command
2023-12-05 09:56:03.490 CET [11245] LOG:  redo done at 0/DE087A8 system usage: CPU: user: 0.30 s, system: 0.30 s, elapsed: 58556.67 s
2023-12-05 09:56:03.490 CET [11245] LOG:  last completed transaction was at log time 2023-12-05 09:00:23.155728+01
2023-12-05 09:56:03.492 CET [11245] LOG:  selected new timeline ID: 16
2023-12-05 09:56:03.665 CET [11245] LOG:  archive recovery complete
2023-12-05 09:56:03.853 CET [11243] LOG:  checkpoint starting: force
2023-12-05 09:56:03.855 CET [11240] LOG:  database system is ready to accept connections
2023-12-05 09:56:03.865 CET [11243] LOG:  checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.006 s, total=0.013 s; sync files=2, longest=0.005 s, average=0.003 s; distance=0 kB, estimate=1389 kB
2023-12-05 09:56:04.834 CET [11240] LOG:  received SIGHUP, reloading configuration files
2023-12-05 09:56:04.834 CET [11240] LOG:  parameter "synchronous_standby_names" changed to "node3"
2023-12-05 09:56:05.038 CET [17082] LOG:  standby "node3" is now a synchronous standby with priority 1

Have you tried to use GitHub issue search?

  • Yes

Anything else we need to know?

No response

First: Why do I see this message 'Postponing promotion because synchronous replication state was updated by somebody else'? As far as I know, the only software updating the synchronous replication state is Patroni.

Patroni isn't a single process; it runs on every node independently, i.e., we are dealing with a distributed system, and "somebody else" could be, for example, the old leader, due to a bug.

Second: I would expect that when a node gets the leader lock, the PostgreSQL instance is always accepting write calls.

Sorry, but you have the wrong expectations. Between acquiring the leader lock and promoting Postgres, some additional actions or checks may be executed. When synchronous replication is enabled, the new leader needs to update the /sync key by writing its own name to it; until that update succeeds, Postgres will not be promoted. Another example is the postgresql.pre_promote hook, which is executed between acquiring the leader lock and running pg_ctl promote. Moreover, if the pre_promote script exits with a non-zero code, the promotion will not be performed. And last but not least, even if pg_ctl promote is executed immediately after acquiring the leader lock, Postgres doesn't immediately switch to read-write. Normally a promote takes a few dozen milliseconds to complete, but sometimes it can take seconds or even minutes (depending on your configuration).
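The ordering described above can be sketched as simplified pseudologic (this is an illustration of the sequence, not Patroni's actual code; function and argument names are hypothetical):

```python
def promote_if_ready(has_leader_lock: bool,
                     sync_key_owned: bool,
                     pre_promote_ok: bool) -> str:
    """Simplified ordering of the steps between lock acquisition and promotion:
    holding the leader lock does not by itself mean Postgres is read-write."""
    if not has_leader_lock:
        return "follow"                 # not the leader at all
    if not sync_key_owned:
        return "postpone promotion"     # /sync key must name the new leader first
    if not pre_promote_ok:
        return "abort promotion"        # optional postgresql.pre_promote hook failed
    return "pg_ctl promote"             # and even this takes time to complete
```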

@fr-br could you please check and post here Patroni logs from the former leader between 2023-12-05 09:55:45 and 2023-12-05 09:56:04?

Also, please give a bit more detail on how exactly the switchover was executed (maybe copy & paste logs from the terminal).

And ideally, if you can reproduce it with DEBUG logs enabled:

log:
  level: DEBUG

Thank you for the quick response. Is there another REST API call that returns OK only when Postgres has been promoted? Maybe GET /read-write?
I'll reproduce the problem with DEBUG and provide you with the logs.

Is there another REST API call that returns OK only when Postgres has been promoted?

No. The thing is that a load balancer also doesn't immediately start sending traffic to a backend with a successful health check. It happens only after a few consecutive health checks succeed. That is, it takes at least a few seconds, and usually by that time the Postgres promotion has already finished.
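A client that must not hit this read-only window can combine the REST check with a `pg_is_in_recovery()` probe before routing writes. A minimal sketch with injected probe callables (helper names are hypothetical; the stub probes below stand in for the real HTTP call to `/leader` and the real SQL query):

```python
import time
from typing import Callable

def writable(leader_http_status: int, pg_in_recovery: bool) -> bool:
    """A node is safe for writes only when Patroni reports it as leader
    AND PostgreSQL has actually finished promotion (left recovery)."""
    return leader_http_status == 200 and not pg_in_recovery

def wait_until_writable(probe_leader: Callable[[], int],
                        probe_recovery: Callable[[], bool],
                        timeout: float = 30.0,
                        interval: float = 0.5) -> bool:
    """Poll both probes until the node is writable or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if writable(probe_leader(), probe_recovery()):
            return True
        time.sleep(interval)
    return False

# Stub probes simulating the window from this issue:
# /leader already returns 200 while pg_is_in_recovery() is still true.
states = iter([(200, True), (200, True), (200, False)])
current = {}
def fake_leader() -> int:
    current['s'] = next(states, (200, False))
    return current['s'][0]
def fake_recovery() -> bool:
    return current['s'][1]

print(wait_until_writable(fake_leader, fake_recovery, timeout=5, interval=0.01))  # True
```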

Patroni 3.2.0

Hmm. There was one bugfix in 3.2.1 that might help you.

Here's the output of a run with DEBUG with Patroni 3.2.0. I will now update to 3.2.1 and check if I can reproduce the problem.

Output of the switchover:

$ patronictl list
+ Cluster: fbc (7309061372335361013) --+-----------+----+-----------+
| Member | Host         | Role         | State     | TL | Lag in MB |
+--------+--------------+--------------+-----------+----+-----------+
| node1  | 192.168.1.11 | Leader       | running   |  2 |           |
| node2  | 192.168.1.12 | Sync Standby | streaming |  2 |         0 |
| node3  | 192.168.1.13 | Replica      | streaming |  2 |         0 |
+--------+--------------+--------------+-----------+----+-----------+
$ patronictl switchover
Current cluster topology
+ Cluster: fbc (7309061372335361013) --+-----------+----+-----------+
| Member | Host         | Role         | State     | TL | Lag in MB |
+--------+--------------+--------------+-----------+----+-----------+
| node1  | 192.168.1.11 | Leader       | running   |  2 |           |
| node2  | 192.168.1.12 | Sync Standby | streaming |  2 |         0 |
| node3  | 192.168.1.13 | Replica      | streaming |  2 |         0 |
+--------+--------------+--------------+-----------+----+-----------+
Primary [node1]:
Candidate ['node2', 'node3'] []: node2
When should the switchover take place (e.g. 2023-12-05T14:03 )  [now]:
Are you sure you want to switchover cluster fbc, demoting current leader node1? [y/N]: y
2023-12-05 13:04:03.50386 Successfully switched over to "node2"
+ Cluster: fbc (7309061372335361013) ---------+----+-----------+
| Member | Host         | Role    | State     | TL | Lag in MB |
+--------+--------------+---------+-----------+----+-----------+
| node1  | 192.168.1.11 | Replica | stopped   |    |   unknown |
| node2  | 192.168.1.12 | Leader  | running   |  2 |           |
| node3  | 192.168.1.13 | Replica | streaming |  2 |         0 |
+--------+--------------+---------+-----------+----+-----------+

Log of node1:

patroni: 2023-12-05 13:04:01,377 INFO: received switchover request with leader=node1 candidate=node2 scheduled_at=None
patroni: 2023-12-05 13:04:01,380 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,380 DEBUG: Starting new HTTP connection (1): 192.168.1.12:8008
patroni: 2023-12-05 13:04:01,384 DEBUG: http://192.168.1.12:8008 "GET /patroni HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:01,384 INFO: Got response from node2 http://192.168.1.12:8008/patroni: {"state": "running", "postmaster_start_time": "2023-12-05 11:51:10.221294+01:00", "role": "replica", "server_version": 150005, "xlog": {"received_location": 113597736, "replayed_location": 113597736, "replayed_timestamp": "2023-12-05 13:03:07.938093+01:00", "paused": false}, "sync_standby": true, "timeline": 2, "replication_state": "streaming", "dcs_last_seen": 1701777839, "database_system_identifier": "7309061372335361013", "patroni": {"version": "3.2.0", "scope": "fbc", "name": "node2"}}
patroni: 2023-12-05 13:04:01,480 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,485 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/failover HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:01,486 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,487 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:01,488 INFO: Lock owner: node1; I am node1
patroni: 2023-12-05 13:04:01,490 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,491 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/6d83a199-aa58-5eed-c9e8-20709c15e5cb HTTP/1.1" 200 272
patroni: 2023-12-05 13:04:01,492 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,492 DEBUG: Resetting dropped connection: 192.168.1.12
patroni: 2023-12-05 13:04:01,495 DEBUG: http://192.168.1.12:8008 "GET /patroni HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:01,495 INFO: Got response from node2 http://192.168.1.12:8008/patroni: {"state": "running", "postmaster_start_time": "2023-12-05 11:51:10.221294+01:00", "role": "replica", "server_version": 150005, "xlog": {"received_location": 113597736, "replayed_location": 113597736, "replayed_timestamp": "2023-12-05 13:03:07.938093+01:00", "paused": false}, "sync_standby": true, "timeline": 2, "replication_state": "streaming", "dcs_last_seen": 1701777839, "database_system_identifier": "7309061372335361013", "patroni": {"version": "3.2.0", "scope": "fbc", "name": "node2"}}
patroni: 2023-12-05 13:04:01,571 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 200 - latency: 2.485 ms
patroni: 2023-12-05 13:04:01,593 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:01,598 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node1?acquire=6d83a199-aa58-5eed-c9e8-20709c15e5cb HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:01,599 INFO: switchover: demoting myself
patroni: 2023-12-05 13:04:01,599 INFO: Demoting self (graceful)
patroni: 2023-12-05 13:04:02,488 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:02,491 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:02,574 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 200 - latency: 1.360 ms
patroni: 2023-12-05 13:04:02,974 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:02,981 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/status HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:02,982 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:02,992 DEBUG: http://127.0.0.1:8500 "DELETE /v1/kv/service/fbc/leader?cas=960 HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:02,993 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:02,993 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/6d83a199-aa58-5eed-c9e8-20709c15e5cb HTTP/1.1" 200 272
patroni: 2023-12-05 13:04:02,994 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:02,999 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node1?acquire=6d83a199-aa58-5eed-c9e8-20709c15e5cb HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:02,999 INFO: Leader key released
patroni: 2023-12-05 13:04:03,494 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,497 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:03,498 DEBUG: API thread: 192.168.1.11 - - "POST /switchover HTTP/1.1" 200 - latency: 2123.736 ms
patroni: 2023-12-05 13:04:03,577 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 1.268 ms
patroni: 2023-12-05 13:04:03,580 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 503 - latency: 0.784 ms
patroni: 2023-12-05 13:04:05,002 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:05,004 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:05,011 INFO: Local timeline=2 lsn=0/6C55DD8
patroni: 2023-12-05 13:04:05,013 INFO: closed patroni connections to postgres
patroni: 2023-12-05 13:04:05,336 DEBUG: Starting postgres: /usr/pgsql-15/bin/postgres -D /var/lib/pgsql/15/data --config-file=/var/lib/pgsql/15/data/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=fbc --wal_level=replica --hot_standby=on --max_connections=200 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=8 --wal_log_hints=on
patroni: 2023-12-05 13:04:05,858 INFO: postmaster pid=2490
patroni: localhost:5432 - no response
patroni: 2023-12-05 13:04:05.917 CET [2490] LOG:  redirecting log output to logging collector process
patroni: 2023-12-05 13:04:05.917 CET [2490] HINT:  Future log output will appear in directory "log".
patroni: localhost:5432 - accepting connections
patroni: localhost:5432 - accepting connections
patroni: 2023-12-05 13:04:08,616 INFO: establishing a new patroni heartbeat connection to postgres
patroni: 2023-12-05 13:04:08,625 ERROR: get_postgresql_status
patroni: Traceback (most recent call last):
patroni: File "/usr/local/lib/python3.6/site-packages/patroni/postgresql/connection.py", line 73, in query
patroni: cursor.execute(sql.encode('utf-8'), params or None)
patroni: psycopg2.OperationalError: server closed the connection unexpectedly
patroni: This probably means the server terminated abnormally
patroni: before or while processing the request.
patroni: The above exception was the direct cause of the following exception:
patroni: Traceback (most recent call last):
patroni: File "/usr/local/lib/python3.6/site-packages/patroni/api.py", line 1279, in get_postgresql_status
patroni: postgresql.wal_flush), retry=retry)[0]
patroni: File "/usr/local/lib/python3.6/site-packages/patroni/api.py", line 1211, in query
patroni: return self.server.query(sql, *params)
patroni: File "/usr/local/lib/python3.6/site-packages/patroni/api.py", line 1412, in query
patroni: return connection.query(sql, *params)
patroni: File "/usr/local/lib/python3.6/site-packages/patroni/postgresql/connection.py", line 84, in query
patroni: raise PostgresConnectionException('connection problems') from exc
patroni: patroni.exceptions.PostgresConnectionException: connection problems
patroni: 2023-12-05 13:04:08,627 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 11.242 ms
patroni: 2023-12-05 13:04:08,628 INFO: establishing a new patroni restapi connection to postgres
patroni: 2023-12-05 13:04:08,639 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 200 - latency: 11.589 ms
patroni: 2023-12-05 13:04:10,114 DEBUG: API thread: 127.0.0.1 - - "GET /health HTTP/1.1" 200 - latency: 2.320 ms
patroni: 2023-12-05 13:04:11,487 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:11,489 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:11,492 INFO: Lock owner: node2; I am node1
patroni: 2023-12-05 13:04:11,492 DEBUG: does not have lock
patroni: 2023-12-05 13:04:11,502 INFO: Local timeline=2 lsn=0/6C55E50
patroni: 2023-12-05 13:04:11,508 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:11,515 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node1?acquire=6d83a199-aa58-5eed-c9e8-20709c15e5cb HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:11,515 INFO: no action. I am (node1), a secondary, and following a leader (node2)
patroni: 2023-12-05 13:04:11,516 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,846 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/leader?index=1047&wait=9.970305442810059s&stale=1 HTTP/1.1" 404 0
patroni: 2023-12-05 13:04:16,847 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,848 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:16,851 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)

Log of node2:

patroni: 2023-12-05 13:03:59,347 INFO: Lock owner: node1; I am node2
patroni: 2023-12-05 13:03:59,347 DEBUG: does not have lock
patroni: 2023-12-05 13:03:59,349 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:03:59,350 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 272
patroni: 2023-12-05 13:03:59,351 INFO: no action. I am (node2), a secondary, and following a leader (node1)
patroni: 2023-12-05 13:03:59,351 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:03:59,797 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 1.624 ms
patroni: 2023-12-05 13:03:59,799 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 200 - latency: 1.184 ms
patroni: 2023-12-05 13:04:00,803 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 1.471 ms
patroni: 2023-12-05 13:04:00,805 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 200 - latency: 0.947 ms
patroni: 2023-12-05 13:04:01,383 DEBUG: API thread: 192.168.1.11 - - "GET /patroni HTTP/1.1" 200 - latency: 1.618 ms
patroni: 2023-12-05 13:04:01,494 DEBUG: API thread: 192.168.1.11 - - "GET /patroni HTTP/1.1" 200 - latency: 1.402 ms
patroni: 2023-12-05 13:04:01,810 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 1.665 ms
patroni: 2023-12-05 13:04:01,812 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 200 - latency: 1.067 ms
patroni: 2023-12-05 13:04:02,825 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 503 - latency: 1.950 ms
patroni: 2023-12-05 13:04:02,828 DEBUG: API thread: 127.0.0.1 - - "GET /replica HTTP/1.1" 200 - latency: 1.329 ms
patroni: 2023-12-05 13:04:02,998 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/leader?index=960&wait=9.858710050582886s&stale=1 HTTP/1.1" 404 0
patroni: 2023-12-05 13:04:02,999 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,000 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:03,002 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,003 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 272
patroni: 2023-12-05 13:04:03,004 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,008 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/leader?acquire=4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:03,008 INFO: Cleaning up failover key after acquiring leader lock...
patroni: 2023-12-05 13:04:03,008 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,013 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/failover HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:03,013 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,015 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:03,016 WARNING: Could not activate Linux watchdog device: Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'
patroni: 2023-12-05 13:04:03,016 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,021 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/sync?cas=974 HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:03,021 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,022 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/sync?stale=1 HTTP/1.1" 200 155
patroni: 2023-12-05 13:04:03,023 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:03,028 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node2?acquire=4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:03,028 INFO: Postponing promotion because synchronous replication state was updated by somebody else
patroni: 2023-12-05 13:04:03,832 DEBUG: API thread: 127.0.0.1 - - "GET /leader HTTP/1.1" 200 - latency: 1.712 ms
patroni: 2023-12-05 13:04:04,458 DEBUG: API thread: 127.0.0.1 - - "GET /health HTTP/1.1" 200 - latency: 1.631 ms
patroni: 2023-12-05 13:04:12,999 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,000 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:13,002 INFO: Lock owner: node2; I am node2
patroni: 2023-12-05 13:04:13,003 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,004 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 272
patroni: 2023-12-05 13:04:13,005 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,010 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/status HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:13,011 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,015 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/sync?cas=1049 HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:13,015 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,016 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/sync?stale=1 HTTP/1.1" 200 152
patroni: 2023-12-05 13:04:13,017 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:13,022 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node2?acquire=4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:13,022 INFO: promoted self to leader because I had the session lock
patroni: server promoting
patroni: 2023-12-05 13:04:14,035 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:14,037 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/?recurse=1&stale=1 HTTP/1.1" 200 None
patroni: 2023-12-05 13:04:14,038 INFO: Lock owner: node2; I am node2
patroni: 2023-12-05 13:04:14,040 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:14,042 DEBUG: http://127.0.0.1:8500 "PUT /v1/session/renew/4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 272
patroni: 2023-12-05 13:04:14,043 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:14,047 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/status HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:14,052 INFO: Assigning synchronous standby status to ['node1']
patroni: server signaled
patroni: 2023-12-05 13:04:14,463 DEBUG: API thread: 127.0.0.1 - - "GET /health HTTP/1.1" 200 - latency: 2.575 ms
patroni: 2023-12-05 13:04:16,461 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,467 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/sync?cas=1049 HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:16,468 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,469 DEBUG: http://127.0.0.1:8500 "GET /v1/kv/service/fbc/sync?stale=1 HTTP/1.1" 200 152
patroni: 2023-12-05 13:04:16,469 INFO: Synchronous replication key updated by someone else
patroni: 2023-12-05 13:04:16,470 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,475 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/history HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:16,476 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0, status=None)
patroni: 2023-12-05 13:04:16,483 DEBUG: http://127.0.0.1:8500 "PUT /v1/kv/service/fbc/members/node2?acquire=4a4916c0-1252-e818-a532-2952b89af54e HTTP/1.1" 200 4
patroni: 2023-12-05 13:04:16,483 INFO: no action. I am (node2), the leader with the lock

Same issue with Patroni 3.2.1.
Anyhow, I will take into account that an OK from GET /leader does not mean immediate write access to the DB.
Thanks for the info!

@fr-br thank you for DEBUG logs, they are very helpful!
One last thing, may I ask you to apply the following patch and try to reproduce it once again?

index 27cab778..fe66c6d8 100644
--- a/patroni/dcs/consul.py
+++ b/patroni/dcs/consul.py
@@ -666,7 +666,7 @@ class Consul(AbstractDCS):
         if ret:  # We have no other choise, only read after write :(
             if not retry.ensure_deadline(0.5):
                 return False
-            _, ret = self.retry(self._client.kv.get, self.sync_path)
+            _, ret = self.retry(self._client.kv.get, self.sync_path, consistency='consistent')
             if ret and (ret.get('Value') or b'').decode('utf-8') == value:
                 return ret['ModifyIndex']
         return False
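The patch forces a consistent (leader-served) read after the write instead of a stale one. A toy model of the read-after-write race it fixes (this is an illustration, not real Consul semantics; a stale read may be served by a follower that has not yet applied the latest write):

```python
class FakeConsulKV:
    """Toy KV store: writes go to the leader, 'stale' reads are served by a
    follower that lags until replicate() runs (simplified, not real Consul)."""
    def __init__(self):
        self.leader = {}
        self.follower = {}   # lags behind the leader
    def put(self, key, value):
        self.leader[key] = value
    def replicate(self):
        self.follower.update(self.leader)
    def get(self, key, consistency="stale"):
        store = self.leader if consistency == "consistent" else self.follower
        return store.get(key)

kv = FakeConsulKV()
kv.put("service/fbc/sync", "node2")          # the new leader updates the /sync key
print(kv.get("service/fbc/sync"))                            # None: stale read misses the write
print(kv.get("service/fbc/sync", consistency="consistent"))  # node2: leader-served read sees it
```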

@CyberDem0n : the patch works fine. I've performed switchovers several times from different nodes and I no longer see the message 'Postponing promotion because synchronous replication state was updated by somebody else'.
Any idea when a new release containing the patch will be available?
Many thanks!

We just did a release less than a week ago, so the next one will come no earlier than January 2024.

Thank you for the info!