EnterpriseDB / repmgr

A lightweight replication manager for PostgreSQL (Postgres)

Home Page: https://repmgr.org/


How to identify old and new primary database from repmgr?

wasiualhasib opened this issue · comments

Basically, I am load balancing with pgpool, but I am afraid that if the primary node (node1) goes down, another node (node2) will become the new primary. How can I then tell pgpool that node2, not node1, is now the primary?

When both primaries are up, it is difficult to identify from the database side which one is the actual primary using "select * from nodes;" in the repmgr schema. The old primary (node1) reports that it is still the primary and that node2 is registered as standby but running as primary, while the new primary (node2) reports both node1 and node2 as primary, with node2 as the new primary and node1 flagged as "! running".

This type of issue happens when a node reboot takes too long. repmgr then promotes another node to primary, but when the reboot finally completes, the database comes back up as the old primary and keeps running. In that case it is necessary to tell pgpool not to connect to node1 (the old primary) and to connect to node2 (the new primary) instead.

Is there any way to identify, from the old primary node (node1), which node is now the new primary? It could be a query, a script, or something at the database level.

I need an expert explanation/help on this. @ibarwick, could you help me with this?

```
[postgres@DB1 ~]$ repmgr cluster show
 ID | Name    | Role    | Status               | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+----------------------+----------+----------+----------+----------+-------------------------------------------------------------------------
 1  | node1   | primary | * running            |          | default  | 100      | 7        | host=10.16.10.17 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 2  | node2   | standby | ! running as primary |          | default  | 100      | 8        | host=10.16.10.18 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 3  | node3   | standby |   running            | ! node2  | default  | 100      | 8        | host=10.16.10.19 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 4  | Witness | witness | * running            | ! node2  | default  | 0        | n/a      | host=10.16.10.20 user=repmgr dbname=repmgr connect_timeout=2 port=5678

WARNING: following issues were detected
  - node "node2" (ID: 2) is registered as standby but running as primary
  - node "node3" (ID: 3) reports a different upstream (reported: "node2", expected "node1")
  - node "Witness" (ID: 4) reports a different upstream (reported: "node2", expected "node1")

[postgres@DB2 ~]$ repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------------
 1  | node1   | primary | ! running |          | default  | 100      | 7        | host=10.16.10.17 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 2  | node2   | primary | * running |          | default  | 100      | 8        | host=10.16.10.18 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 3  | node3   | standby |   running | node2    | default  | 100      | 8        | host=10.16.10.19 user=repmgr dbname=repmgr connect_timeout=2 port=5678
 4  | Witness | witness | * running | node2    | default  | 0        | n/a      | host=10.16.10.20 user=repmgr dbname=repmgr connect_timeout=2 port=5678

WARNING: following issues were detected
  - node "node1" (ID: 1) is running but the repmgr node record is inactive
```
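For context on why this is hard to detect from SQL alone: the usual database-level role check is `SELECT pg_is_in_recovery();`, which returns `true` on a standby and `false` on a primary. In the split-brain above, both node1 and node2 answer `false`, so this check by itself cannot distinguish the old primary from the new one. A minimal sketch (the node names and answers are taken from the cluster output above; the function itself is hypothetical, not part of repmgr):

```python
# Hypothetical sketch: classify each node's role by the answer it would give
# to "SELECT pg_is_in_recovery();". A standby answers True, a primary False.
# In the split-brain shown above, BOTH node1 and node2 answer False, so this
# check alone cannot identify the real primary.

def classify_roles(recovery_answers: dict) -> dict:
    """Map node name -> 'primary'/'standby' from pg_is_in_recovery() results."""
    return {node: ("standby" if in_recovery else "primary")
            for node, in_recovery in recovery_answers.items()}

# Simulated answers matching the cluster state shown above:
answers = {"node1": False, "node2": False, "node3": True}
print(classify_roles(answers))  # node1 AND node2 both classified as primary
```

This is exactly the ambiguity in the question: from node1's point of view, nothing in a plain role check says it is obsolete.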

I'm familiar with this situation.

First of all, I'd recommend not rebooting the primary node during planned maintenance; switch over to another node first and avoid all of this.

This doesn't solve the problem during outages, but the lower value in the Timeline column (7 vs. 8 here) clearly shows that node1 is obsolete in this case. I don't know why the repmgr developers don't use this to improve the tool.
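The timeline comparison above can be turned into a small script. Each node's current timeline can be read locally, e.g. with `SELECT timeline_id FROM pg_control_checkpoint();` (available from PostgreSQL 9.6), or taken from the Timeline column of `repmgr cluster show`. A minimal sketch of the decision logic, using the node names and timeline values from the output above (this is an assumption of how one might script it, not a repmgr feature):

```python
# Sketch: among the nodes that claim to be primary, the one on the highest
# timeline is the current primary; any primary-claimant on a lower timeline
# is an obsolete old primary. Timelines below are from "repmgr cluster show".

def pick_current_primary(nodes: list) -> str:
    """Return the name of the primary-claiming node with the highest timeline."""
    claimants = [n for n in nodes if n["claims_primary"]]
    return max(claimants, key=lambda n: n["timeline"])["name"]

nodes = [
    {"name": "node1", "claims_primary": True,  "timeline": 7},
    {"name": "node2", "claims_primary": True,  "timeline": 8},
    {"name": "node3", "claims_primary": False, "timeline": 8},
]
print(pick_current_primary(nodes))  # node2 wins; node1 (timeline 7) is obsolete
```

A wrapper around this could then tell pgpool (e.g. via `pcp_detach_node`) to stop routing to the obsolete node, though that integration is left out here.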