10.3. Primary visibility consensus

In more complex replication setups, particularly where replication occurs between multiple datacentres, it's possible that some but not all standbys get cut off from the primary (but not from the other standbys).

In this situation, normally it's not desirable for any of the standbys which have been cut off to initiate a failover, as the primary is still functioning and standbys are connected. Beginning with ltcluster 5 it is now possible for the affected standbys to build a consensus about whether the primary is still available to some standbys ("primary visibility consensus"). This is done by polling each standby (and the witness, if present) for the time it last saw the primary; if any have seen the primary very recently, it's reasonable to infer that the primary is still available and a failover should not be started.

The time the primary was last seen by each node can be checked by executing ltcluster service status (ltcluster 4.2 - 4.4: ltcluster daemon status) which includes this in its output, e.g.:

$ ltcluster -f /etc/ltcluster.conf service status
 ID | Name  | Role    | Status    | Upstream | ltclusterd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 27259 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 27272 | no      | 1 second(s) ago
 3  | node3 | standby |   running | node1    | running | 27282 | no      | 0 second(s) ago
 4  | node4 | witness | * running | node1    | running | 27298 | no      | 1 second(s) ago

To enable this functionality, in ltcluster.conf set:

      primary_visibility_consensus=true

Note

primary_visibility_consensus must be set to true on all nodes for it to be effective.

The following sample ltclusterd log output demonstrates the behaviour in a situation where one of three standbys is no longer able to connect to the primary, but can connect to the two other standbys ("sibling nodes"):

    [2019-05-17 05:36:12] [WARNING] unable to reconnect to node 1 after 3 attempts
    [2019-05-17 05:36:12] [INFO] 2 active sibling nodes registered
    [2019-05-17 05:36:12] [INFO] local node's last receive lsn: 0/7006E58
    [2019-05-17 05:36:12] [INFO] checking state of sibling node "node3" (ID: 3)
    [2019-05-17 05:36:12] [INFO] node "node3" (ID: 3) reports its upstream is node 1, last seen 1 second(s) ago
    [2019-05-17 05:36:12] [NOTICE] node 3 last saw primary node 1 second(s) ago, considering primary still visible
    [2019-05-17 05:36:12] [INFO] last receive LSN for sibling node "node3" (ID: 3) is: 0/7006E58
    [2019-05-17 05:36:12] [INFO] node "node3" (ID: 3) has same LSN as current candidate "node2" (ID: 2)
    [2019-05-17 05:36:12] [INFO] checking state of sibling node "node4" (ID: 4)
    [2019-05-17 05:36:12] [INFO] node "node4" (ID: 4) reports its upstream is node 1, last seen 0 second(s) ago
    [2019-05-17 05:36:12] [NOTICE] node 4 last saw primary node 0 second(s) ago, considering primary still visible
    [2019-05-17 05:36:12] [INFO] last receive LSN for sibling node "node4" (ID: 4) is: 0/7006E58
    [2019-05-17 05:36:12] [INFO] node "node4" (ID: 4) has same LSN as current candidate "node2" (ID: 2)
    [2019-05-17 05:36:12] [INFO] 2 nodes can see the primary
    [2019-05-17 05:36:12] [DETAIL] following nodes can see the primary:
     - node "node3" (ID: 3): 1 second(s) ago
     - node "node4" (ID: 4): 0 second(s) ago
    [2019-05-17 05:36:12] [NOTICE] cancelling failover as some nodes can still see the primary
    [2019-05-17 05:36:12] [NOTICE] election cancelled
    [2019-05-17 05:36:14] [INFO] node "node2" (ID: 2) monitoring upstream node "node1" (ID: 1) in degraded state

In this situation it will cancel the failover and enter degraded monitoring node, waiting for the primary to reappear.