Patroni does not switch over when the primary server's disk is full
udit47 opened this issue
What happened?
I have a 3-node cluster (2 PostgreSQL servers and 3 etcd members sharing the nodes). Recently the primary server ran out of space on the disk where the WAL files are stored, and PostgreSQL eventually crashed. However, no switchover happened.
At that time, patronictl status showed "Crashed" as the state of the Leader and "Running" as the state of the Replica. Eventually I freed up disk space, PostgreSQL recovered from the crash, and it started serving requests again.
How can we reproduce it (as minimally and precisely as possible)?
Use the same setup as mine and generate enough WAL to fill the disk that holds it, so that PostgreSQL crashes.
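A minimal sketch of the reproduction step, for a disposable test cluster only. It assumes the script runs on the primary host (so it can see the WAL volume), that `psycopg2` is installed, and that something keeps WAL from being recycled (for example a failing `archive_command` or a lagging replication slot; that is an assumption, not something confirmed in this report). The table name `wal_filler` and the function names are hypothetical.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of free space (0.0 to 1.0) on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def fill_wal(dsn: str, wal_dir: str, stop_below: float = 0.01,
             batch_rows: int = 100_000) -> None:
    """Insert rows in batches until free space on the WAL volume drops below `stop_below`.

    Every INSERT is WAL-logged, so WAL grows with each batch. Whether it
    accumulates on disk depends on archiving and replication slots.
    """
    import psycopg2  # third-party; imported lazily so free_fraction works without it

    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            cur.execute("CREATE TABLE IF NOT EXISTS wal_filler (id bigint, pad text)")
            while free_fraction(wal_dir) > stop_below:
                cur.execute(
                    "INSERT INTO wal_filler "
                    "SELECT g, repeat('x', 1000) FROM generate_series(1, %s) AS g",
                    (batch_rows,),
                )
    finally:
        conn.close()
```

With the configuration below, the WAL directory would be `/clog/pgsql/14`, e.g. `fill_wal("host=<ip> dbname=postgres user=postgres", "/clog/pgsql/14", stop_below=0.0)`. With `stop_below=0.0` the loop is expected to end with a connection error once PostgreSQL crashes.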
What did you expect to happen?
A switchover to the standby should have happened.
Patroni/PostgreSQL/DCS version
- Patroni version: 2.1.3
- PostgreSQL version: 14.10
- DCS (and its version): etcd 3.3.25 api version : 2
Patroni configuration file
restapi:
  listen: <ip>:8008
  connect_address: <ip>:8008
etcd:
  host: <ip>:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: on
        logging_collector: on
        max_wal_senders: 5
        max_replication_slots: 5
        wal_log_hints: on
        track_functions: all
        max_connections: 1000
        max_worker_processes: '8'
        log_line_prefix: '%t [%p]: user=%u,db=%d,app=%a,client=%h'
        log_destination: 'stderr'
        log_filename: 'postgresql-%Y-%m-%d.log'
        log_rotation_age: 1d
        log_directory: '/var/log/postgresql'
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication replicator 0.0.0.0/0 scram-sha-256
  - host replication replicator 127.0.0.1/32 trust
  - host all all 0.0.0.0/0 scram-sha-256
postgresql:
  listen: <ip>:5432
  connect_address: <ip>:5432
  data_dir: "/dt/pgsql/14"
  bin_dir: "/usr/lib/postgresql/14/bin"
  pgpass: /tmp/pgpass0
  basebackup:
  - waldir: "/clog/pgsql/14"
  parameters:
    unix_socket_directories: '/var/run/postgresql'
    archive_command: pgbackrest --stanza=bkp_stanza archive-push "%p"
    archive_mode: "on"
log:
  level: INFO
  traceback_level: ERROR
patronictl show-config
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    hot_standby: true
    log_destination: stderr
    log_directory: /var/log/postgresql
    log_filename: postgresql-%Y-%m-%d.log
    log_line_prefix: '%t [%p]: user=%u,db=%d,app=%a,client=%h'
    log_rotation_age: 1d
    logging_collector: true
    max_connections: 1000
    max_replication_slots: 5
    max_wal_senders: 5
    max_worker_processes: '8'
    shared_preload_libraries: pglogical
    track_functions: all
    wal_level: logical
    wal_log_hints: true
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30
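The `bootstrap.dcs` section of the configuration file is only applied when the cluster is first initialized, so the `patronictl show-config` output is the one that reflects the running cluster. The two listings in this report are not identical. A small sketch that compares them, with the parameter values copied by hand from the listings above (the helper names are hypothetical):

```python
# PostgreSQL parameters from bootstrap.dcs in the configuration file.
BOOTSTRAP_PARAMS = {
    "wal_level": "replica",
    "hot_standby": "on",
    "logging_collector": "on",
    "max_wal_senders": "5",
    "max_replication_slots": "5",
    "wal_log_hints": "on",
    "track_functions": "all",
    "max_connections": "1000",
    "max_worker_processes": "8",
    "log_line_prefix": "%t [%p]: user=%u,db=%d,app=%a,client=%h",
    "log_destination": "stderr",
    "log_filename": "postgresql-%Y-%m-%d.log",
    "log_rotation_age": "1d",
    "log_directory": "/var/log/postgresql",
}

# PostgreSQL parameters from `patronictl show-config`: same values except
# for the entries overridden or added below.
DCS_PARAMS = dict(
    BOOTSTRAP_PARAMS,
    hot_standby="true",
    logging_collector="true",
    wal_log_hints="true",
    wal_level="logical",
    shared_preload_libraries="pglogical",
)


def normalize(value):
    """Treat on/true/yes and off/false/no as the same boolean spelling."""
    lowered = str(value).strip().lower()
    if lowered in {"on", "true", "yes"}:
        return "on"
    if lowered in {"off", "false", "no"}:
        return "off"
    return lowered


def diff_params(bootstrap, dcs):
    """Return {name: (bootstrap_value, dcs_value)} for parameters that differ."""
    result = {}
    for name in sorted(set(bootstrap) | set(dcs)):
        left, right = bootstrap.get(name), dcs.get(name)
        if left is None or right is None or normalize(left) != normalize(right):
            result[name] = (left, right)
    return result


for name, (old, new) in diff_params(BOOTSTRAP_PARAMS, DCS_PARAMS).items():
    print(f"{name}: bootstrap={old!r} dcs={new!r}")
```

The differences are `wal_level` (`replica` in the file, `logical` in the DCS) and `shared_preload_libraries` (present only in the DCS), which matches the `wal_level setting: logical` line in the pg_controldata output further down.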
Patroni log files
pg-node-0 (Leader)
2023-12-27 18:33:18,133 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 681, in query
raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2023-12-27 18:33:18,134 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: no connection to the server
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 681, in query
raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2023-12-27 18:33:18,251 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:18,251 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:33:19,660 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 675, in query
with self.patroni.postgresql.connection().cursor() as cursor:
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 255, in connection
return self._connection.get()
File "/usr/lib/python3/dist-packages/patroni/postgresql/connection.py", line 24, in get
self._connection = psycopg.connect(**self._conn_kwargs)
File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "<primary_ip>", port 5432 failed: FATAL: the database system is in recovery mode
2023-12-27 18:33:19,772 WARNING: Failed to determine PostgreSQL state from the connection, falling back to cached role
2023-12-27 18:33:19,772 ERROR: Exception when called state_handler.last_operation()
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/ha.py", line 157, in update_lock
last_lsn = self.state_handler.last_operation()
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 992, in last_operation
return self._wal_position(self.is_leader(), self._cluster_info_state_get('wal_position'),
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 354, in _cluster_info_state_get
raise PostgresConnectionException(self._cluster_info_state['error'])
patroni.exceptions.PostgresConnectionException: "'Too many retry attempts'"
2023-12-27 18:33:19,777 WARNING: Failed to determine PostgreSQL state from the connection, falling back to cached role
2023-12-27 18:33:19,780 INFO: Error communicating with PostgreSQL. Will try again later
2023-12-27 18:33:28,259 WARNING: Postgresql is not running.
2023-12-27 18:33:28,259 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:28,275 INFO: doing crash recovery in a single user mode
2023-12-27 18:33:30,782 ERROR: Crash recovery finished with code=-6
2023-12-27 18:33:30,782 INFO: stdout=
PostgreSQL stand-alone backend 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1)
backend>
2023-12-27 18:33:30,782 INFO: stderr=2023-12-27 18:33:28 IST [2444981]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at 2023-12-27 18:33:20 IST
2023-12-27 18:33:28 IST [2444981]: user=,db=,app=,client=HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
2023-12-27 18:33:29 IST [2444981]: user=,db=,app=,client=LOG: database system was not properly shut down; automatic recovery in progress
2023-12-27 18:33:29 IST [2444981]: user=,db=,app=,client=LOG: redo starts at 588/BA161170
2023-12-27 18:33:30 IST [2444981]: user=,db=,app=,client=LOG: redo done at 588/D7FFFF28 system usage: CPU: user: 0.87 s, system: 0.13 s, elapsed: 1.01 s
2023-12-27 18:33:30 IST [2444981]: user=,db=,app=,client=PANIC: could not write to file "pg_wal/xlogtemp.2444981": No space left on device
2023-12-27 18:33:38,251 WARNING: Postgresql is not running.
2023-12-27 18:33:38,251 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:38,255 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:38,258 INFO: starting as readonly because i had the session lock
2023-12-27 18:33:38,462 INFO: postmaster pid=2444999
2023-12-27 18:33:39,475 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:39,475 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:33:39,490 INFO: promoted self to leader because I had the session lock
2023-12-27 18:33:39,492 INFO: cleared rewind state after becoming the leader
2023-12-27 18:33:40,692 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 680, in query
raise e
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2023-12-27 18:33:40,692 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2023-12-27 18:33:59,476 WARNING: Postgresql is not running.
2023-12-27 18:33:59,476 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:59,479 INFO: pg_controldata:
pg_control version number: 1300
Catalog version number: 202107181
Database system identifier: 7277219729343958140
Database cluster state: in archive recovery
pg_control last modified: Wed Dec 27 18:33:38 2023
Latest checkpoint location: 588/D8000058
Latest checkpoint's REDO location: 588/D8000058
Latest checkpoint's REDO WAL file: 0000000100000588000000D8
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:166852302
Latest checkpoint's NextOID: 621136355
Latest checkpoint's NextMultiXactId: 17
Latest checkpoint's NextMultiOffset: 33
Latest checkpoint's oldestXID: 727
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Wed Dec 27 18:33:30 2023
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 588/D9000000
Min recovery ending loc's timeline: 1
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: on
max_connections setting: 1000
max_worker_processes setting: 8
max_wal_senders setting: 5
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: 7ee5755629ccc8cba5a74cfde74fecfe6abb72475ab6043d0aa9f594ec494f3e
2023-12-27 18:33:59,480 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:33:59,483 INFO: starting as readonly because i had the session lock
2023-12-27 18:33:59,582 INFO: postmaster pid=2445060
2023-12-27 18:34:00,595 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:34:00,595 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:34:00,607 INFO: promoted self to leader because I had the session lock
2023-12-27 18:34:00,609 INFO: cleared rewind state after becoming the leader
2023-12-27 18:34:01,735 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 681, in query
raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2023-12-27 18:34:01,736 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 675, in query
with self.patroni.postgresql.connection().cursor() as cursor:
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 255, in connection
return self._connection.get()
File "/usr/lib/python3/dist-packages/patroni/postgresql/connection.py", line 24, in get
self._connection = psycopg.connect(**self._conn_kwargs)
File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 681, in query
raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2023-12-27 18:34:10,596 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:34:10,599 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:34:10,700 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:34:10,700 WARNING: Retry got exception: 'connection problems'
2023-12-27 18:34:10,706 INFO: updated leader lock during promote
2023-12-27 18:34:10,746 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 675, in query
with self.patroni.postgresql.connection().cursor() as cursor:
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 255, in connection
return self._connection.get()
File "/usr/lib/python3/dist-packages/patroni/postgresql/connection.py", line 24, in get
self._connection = psycopg.connect(**self._conn_kwargs)
File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 681, in query
raise PostgresConnectionException('connection problems')
patroni.exceptions.PostgresConnectionException: 'connection problems'
2023-12-27 18:34:10,746 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 675, in query
with self.patroni.postgresql.connection().cursor() as cursor:
File "/usr/lib/python3/dist-packages/patroni/postgresql/__init__.py", line 255, in connection
return self._connection.get()
File "/usr/lib/python3/dist-packages/patroni/postgresql/connection.py", line 24, in get
self._connection = psycopg.connect(**self._conn_kwargs)
File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:20,600 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:34:20,603 INFO: starting as readonly because i had the session lock
2023-12-27 18:34:20,701 INFO: postmaster pid=2445122
2023-12-27 18:34:21,715 INFO: Lock owner: pg-node-0; I am pg-node-0
2023-12-27 18:34:21,715 INFO: establishing a new patroni connection to the postgres cluster
2023-12-27 18:34:21,726 INFO: promoted self to leader because I had the session lock
2023-12-27 18:34:21,726 INFO: cleared rewind state after becoming the leader
2023-12-27 18:34:22,759 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 610, in get_postgresql_status
row = self.query(stmt.format(postgresql.wal_name, postgresql.lsn_name), retry=retry)[0]
File "/usr/lib/python3/dist-packages/patroni/api.py", line 592, in query
return self.server.query(sql, *params)
File "/usr/lib/python3/dist-packages/patroni/api.py", line 680, in query
raise e
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2023-12-27 18:34:22,759 ERROR: get_postgresql_status
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/patroni/api.py", line 676, in query
cursor.execute(sql, params)
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg-node-1 (Replica)
2023-12-27 18:33:18,778 INFO: Selected new etcd server http://172.22.46.174:2379
2023-12-27 18:33:18,795 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:19,823 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:28,283 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:38,275 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:39,493 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:49,494 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:33:59,502 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:00,613 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:10,615 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:20,620 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:21,732 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:31,734 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:41,739 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:42,860 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:34:52,860 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:02,865 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:03,979 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:13,979 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:23,985 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:25,101 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:35,100 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:45,107 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:46,231 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:35:56,222 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:36:06,228 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:36:07,350 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:36:17,342 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
2023-12-27 18:36:27,348 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)
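One way to read the replica log above is to measure how often the replica reported that it was following a leader, and compare the largest gap with `ttl: 30`. The sketch below does only that; it is not a diagnosis, and the function name and sample selection are illustrative. The three sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches Patroni log lines such as "2023-12-27 18:33:18,795 INFO: message".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+): (.*)$")


def summarize_following(lines, needle="following a leader"):
    """Return (count, max_gap_seconds) for log lines whose message contains `needle`."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and needle in match.group(3):
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return len(stamps), (max(gaps) if gaps else 0.0)


SAMPLE = [
    "2023-12-27 18:33:18,795 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)",
    "2023-12-27 18:33:19,823 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)",
    "2023-12-27 18:33:28,283 INFO: no action. I am (pg-node-1), a secondary, and following a leader (pg-node-0)",
]

count, max_gap = summarize_following(SAMPLE)
print(f"{count} 'following a leader' messages, largest gap {max_gap:.3f}s")
```

Run against the full replica log, this gives a quick check of whether the replica ever stopped seeing pg-node-0 as the leader. In the excerpt above the message never stops, which is consistent with the "Lock owner: pg-node-0; I am pg-node-0" lines in the leader's own log.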
PostgreSQL log files
pg-node-0
2023-12-27 18:33:18 IST [2556450]: user=,db=,app=,client=LOG: all server processes terminated; reinitializing
2023-12-27 18:33:19 IST [2444952]: user=,db=,app=,client=LOG: database system was interrupted; last known up at 2023-12-27 18:24:32 IST
2023-12-27 18:33:19 IST [2444953]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:19 IST [2444956]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:19 IST [2444957]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:19 IST [2444958]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:19 IST [2444960]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:20 IST [2444952]: user=,db=,app=,client=LOG: database system was not properly shut down; automatic recovery in progress
2023-12-27 18:33:20 IST [2444952]: user=,db=,app=,client=LOG: redo starts at 588/BA161170
2023-12-27 18:33:21 IST [2444963]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:21 IST [2444964]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is in recovery mode
2023-12-27 18:33:21 IST [2444952]: user=,db=,app=,client=LOG: redo done at 588/D7FFFF28 system usage: CPU: user: 0.85 s, system: 0.16 s, elapsed: 1.03 s
2023-12-27 18:33:21 IST [2444952]: user=,db=,app=,client=PANIC: could not write to file "pg_wal/xlogtemp.2444952": No space left on device
2023-12-27 18:33:21 IST [2556450]: user=,db=,app=,client=LOG: startup process (PID 2444952) was terminated by signal 6: Aborted
2023-12-27 18:33:21 IST [2556450]: user=,db=,app=,client=LOG: aborting startup due to startup process failure
2023-12-27 18:33:21 IST [2556450]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:33:38 IST [2444999]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:33:38 IST [2444999]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:33:38 IST [2444999]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=LOG: database system shutdown was interrupted; last known up at 2023-12-27 18:33:30 IST
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=LOG: database system was not properly shut down; automatic recovery in progress
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:33:38 IST [2445002]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:33:38 IST [2444999]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:33:39 IST [2445002]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:33:39 IST [2445002]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.73 s
2023-12-27 18:33:39 IST [2445002]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:33:39 IST [2445002]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445002": No space left on device
2023-12-27 18:33:39 IST [2444999]: user=,db=,app=,client=LOG: startup process (PID 2445002) exited with exit code 1
2023-12-27 18:33:39 IST [2444999]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:33:39 IST [2444999]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:33:39 IST [2444999]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:33:59 IST [2445060]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:33:59 IST [2445060]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:33:59 IST [2445060]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:33:59 IST [2445063]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:33:59 IST [2445060]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:34:00 IST [2445063]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:34:00 IST [2445063]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.72 s
2023-12-27 18:34:00 IST [2445063]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:34:00 IST [2445063]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445063": No space left on device
2023-12-27 18:34:00 IST [2445060]: user=,db=,app=,client=LOG: startup process (PID 2445063) exited with exit code 1
2023-12-27 18:34:00 IST [2445060]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:34:00 IST [2445060]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:34:00 IST [2445060]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:34:20 IST [2445122]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:34:20 IST [2445122]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:34:20 IST [2445122]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:34:20 IST [2445125]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:34:21 IST [2445122]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:34:21 IST [2445125]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:34:21 IST [2445125]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.72 s
2023-12-27 18:34:21 IST [2445125]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:34:21 IST [2445125]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445125": No space left on device
2023-12-27 18:34:21 IST [2445122]: user=,db=,app=,client=LOG: startup process (PID 2445125) exited with exit code 1
2023-12-27 18:34:21 IST [2445122]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:34:21 IST [2445122]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:34:21 IST [2445122]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:34:41 IST [2445182]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:34:41 IST [2445182]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:34:41 IST [2445182]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:34:41 IST [2445185]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:34:41 IST [2445185]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:34:42 IST [2445182]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.72 s
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:34:42 IST [2445185]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445185": No space left on device
2023-12-27 18:34:42 IST [2445182]: user=,db=,app=,client=LOG: startup process (PID 2445185) exited with exit code 1
2023-12-27 18:34:42 IST [2445182]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:34:42 IST [2445182]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:34:42 IST [2445182]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:35:03 IST [2445247]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:35:03 IST [2445247]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:35:03 IST [2445247]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:35:03 IST [2445247]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.72 s
2023-12-27 18:35:03 IST [2445250]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:35:04 IST [2445250]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445250": No space left on device
2023-12-27 18:35:04 IST [2445247]: user=,db=,app=,client=LOG: startup process (PID 2445250) exited with exit code 1
2023-12-27 18:35:04 IST [2445247]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:35:04 IST [2445247]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:35:04 IST [2445247]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:35:24 IST [2445311]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:35:24 IST [2445311]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:35:24 IST [2445311]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:35:24 IST [2445314]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:35:24 IST [2445311]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:35:25 IST [2445314]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:35:25 IST [2445314]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.72 s
2023-12-27 18:35:25 IST [2445314]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:35:25 IST [2445314]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445314": No space left on device
2023-12-27 18:35:25 IST [2445311]: user=,db=,app=,client=LOG: startup process (PID 2445314) exited with exit code 1
2023-12-27 18:35:25 IST [2445311]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:35:25 IST [2445311]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:35:25 IST [2445311]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:35:45 IST [2445371]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:35:45 IST [2445371]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:35:45 IST [2445371]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:35:45 IST [2445377]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is starting up
2023-12-27 18:35:45 IST [2445378]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is starting up
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:35:45 IST [2445374]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:35:45 IST [2445371]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
2023-12-27 18:35:46 IST [2445374]: user=,db=,app=,client=LOG: received promote request
2023-12-27 18:35:46 IST [2445374]: user=,db=,app=,client=LOG: redo done at 588/D80000D0 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.73 s
2023-12-27 18:35:46 IST [2445374]: user=,db=,app=,client=LOG: selected new timeline ID: 2
2023-12-27 18:35:46 IST [2445374]: user=,db=,app=,client=FATAL: could not write to file "pg_wal/xlogtemp.2445374": No space left on device
2023-12-27 18:35:46 IST [2445371]: user=,db=,app=,client=LOG: startup process (PID 2445374) exited with exit code 1
2023-12-27 18:35:46 IST [2445371]: user=,db=,app=,client=LOG: terminating any other active server processes
2023-12-27 18:35:46 IST [2445371]: user=,db=,app=,client=LOG: shutting down due to startup process failure
2023-12-27 18:35:46 IST [2445371]: user=,db=,app=,client=LOG: database system is shut down
2023-12-27 18:36:06 IST [2445433]: user=,db=,app=,client=LOG: starting PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
2023-12-27 18:36:06 IST [2445433]: user=,db=,app=,client=LOG: listening on IPv4 address "<primary_ip>", port 5432
2023-12-27 18:36:06 IST [2445433]: user=,db=,app=,client=LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=LOG: database system was interrupted while in recovery at log time 2023-12-27 18:33:30 IST
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2023-12-27 18:36:06 IST [2445439]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is starting up
2023-12-27 18:36:06 IST [2445440]: user=postgres,db=postgres,app=[unknown],client=<primary_ip>FATAL: the database system is starting up
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=WARNING: specified neither primary_conninfo nor restore_command
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there.
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=LOG: entering standby mode
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=LOG: redo starts at 588/D80000D0
2023-12-27 18:36:06 IST [2445436]: user=,db=,app=,client=LOG: consistent recovery state reached at 588/D9000000
2023-12-27 18:36:06 IST [2445433]: user=,db=,app=,client=LOG: database system is ready to accept read-only connections
pg-node-1
2023-12-27 18:33:17 IST [2736117]: user=,db=,app=,client=FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2023-12-27 18:33:17 IST [2736113]: user=,db=,app=,client=LOG: unexpected pageaddr 588/5C000000 in log segment 0000000100000588000000D8, offset 0
2023-12-27 18:33:17 IST [502192]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: FATAL: the database system is in recovery mode
2023-12-27 18:33:22 IST [502202]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:27 IST [502215]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:32 IST [502234]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:37 IST [502247]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:42 IST [502258]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:47 IST [502267]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:52 IST [502274]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:33:57 IST [502281]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:02 IST [502292]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:07 IST [502299]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:12 IST [502306]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:17 IST [502315]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:22 IST [502323]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:27 IST [502330]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:32 IST [502341]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:37 IST [502348]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:42 IST [502356]: user=,db=,app=,client=LOG: started streaming WAL from primary at 588/D8000000 on timeline 1
2023-12-27 18:34:42 IST [2736113]: user=,db=,app=,client=LOG: successfully skipped missing contrecord at 588/D7FFFFB0, overwritten at 2023-12-27 18:33:30.167074+05:30
2023-12-27 18:34:42 IST [2736113]: user=,db=,app=,client=CONTEXT: WAL redo at 588/D8000028 for XLOG/OVERWRITE_CONTRECORD: lsn 588/D7FFFFB0; time 2023-12-27 18:33:30.167074+05:30
2023-12-27 18:34:42 IST [502356]: user=,db=,app=,client=FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2023-12-27 18:34:47 IST [2736113]: user=,db=,app=,client=LOG: unexpected pageaddr 588/5D000000 in log segment 0000000100000588000000D9, offset 0
2023-12-27 18:34:47 IST [502365]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:52 IST [502372]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:34:57 IST [502379]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:02 IST [502391]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:07 IST [502401]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:12 IST [502408]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:17 IST [502417]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:22 IST [502424]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:27 IST [502434]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:32 IST [502444]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:37 IST [502451]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:42 IST [502458]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:47 IST [502470]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:52 IST [502477]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:35:57 IST [502484]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:02 IST [502493]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:07 IST [502503]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:12 IST [502510]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:17 IST [502519]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:22 IST [502526]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:27 IST [502534]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: FATAL: the database system is starting up
2023-12-27 18:36:32 IST [502545]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:37 IST [502553]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:42 IST [502560]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:47 IST [502569]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:52 IST [502579]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:36:57 IST [502586]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:02 IST [502595]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:07 IST [502602]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:12 IST [502612]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:17 IST [502621]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:22 IST [502628]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:27 IST [502635]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:32 IST [502647]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:37 IST [502655]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:42 IST [502662]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
2023-12-27 18:37:47 IST [502671]: user=,db=,app=,client=FATAL: could not connect to the primary server: connection to server at "<primary_ip>", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
Have you tried to use GitHub issue search?
- Yes
Anything else we need to know?
The replica server has half the CPU and RAM capacity of the leader server.
> Switchover to standby should have happened
It is not a Patroni task to monitor free disk space.
If the Postgres primary crashes (which usually happens when Postgres can't write to disk), Patroni tries to start it up again.
This behavior can be changed by setting primary_start_timeout to 0.
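For illustration, here is a minimal sketch of how that could look in the dynamic (DCS) configuration, reusing the ttl/loop_wait/retry_timeout values from the config posted above. It assumes the setting is applied to a running cluster with `patronictl edit-config`, since the `bootstrap.dcs` section is only read when the cluster is first initialized. It also assumes Patroni 3.0 or newer; as far as I recall, older releases such as 2.1.3 spell the parameter `master_start_timeout`, so check the documentation for your version.

```yaml
# Dynamic configuration as shown by `patronictl edit-config`.
# Assumption: Patroni 3.0+ (older releases use master_start_timeout instead).
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
# 0 = do not wait for the crashed primary to recover;
# fail over immediately after the crash is detected, if a healthy replica is available.
primary_start_timeout: 0
```

With the default (300 seconds) Patroni keeps trying to restart the crashed primary first, which matches the restart loop visible in the primary's log above.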
> Patroni version: 2.1.3
Please upgrade to the latest version.