If a primary server fails or needs to be removed from the replication cluster, a new primary server must be designated to ensure the cluster continues to function correctly. This can be done with ltcluster standby promote, which promotes the standby on the current server to primary.
To demonstrate this, set up a replication cluster with a primary and two attached standby servers so that the cluster looks like this:
$ ltcluster -f /etc/ltcluster.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------------
 1  | node1 | primary | * running |          | default  | host=node1 dbname=ltcluster user=ltcluster
 2  | node2 | standby |   running | node1    | default  | host=node2 dbname=ltcluster user=ltcluster
 3  | node3 | standby |   running | node1    | default  | host=node3 dbname=ltcluster user=ltcluster
Stop the current primary, e.g. with:
$ lt_ctl -D /var/lib/lightdb/data -m fast stop
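To confirm the server is no longer running, the status subcommand can be used. This is a sketch assuming lt_ctl follows the standard pg_ctl interface (as the stop command above suggests):

```shell
# Query the state of the server in the stopped primary's data directory;
# if the server is not running, lt_ctl reports this and exits with a
# non-zero status.
$ lt_ctl -D /var/lib/lightdb/data status
```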
At this point the replication cluster will be in a partially disabled state, with both standbys accepting read-only connections while attempting to connect to the stopped primary. Note that the ltcluster metadata table will not yet have been updated; executing ltcluster cluster show will note the discrepancy:
$ ltcluster -f /etc/ltcluster.conf cluster show
 ID | Name  | Role    | Status        | Upstream | Location | Connection string
----+-------+---------+---------------+----------+----------+--------------------------------------------
 1  | node1 | primary | ? unreachable |          | default  | host=node1 dbname=ltcluster user=ltcluster
 2  | node2 | standby |   running     | node1    | default  | host=node2 dbname=ltcluster user=ltcluster
 3  | node3 | standby |   running     | node1    | default  | host=node3 dbname=ltcluster user=ltcluster

WARNING: following issues were detected
  - node "node1" (ID: 1) is registered as an active primary but is unreachable
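The recovery state of each standby can also be verified directly. The following sketch assumes LightDB retains PostgreSQL's pg_is_in_recovery() function and ships a psql-style client (here called ltsql; substitute the actual client name if it differs):

```shell
# On a standby still in recovery, this read-only query succeeds
# and returns "t" (true).
$ ltsql -h node2 -d ltcluster -U ltcluster -c 'SELECT pg_is_in_recovery()'
$ ltsql -h node3 -d ltcluster -U ltcluster -c 'SELECT pg_is_in_recovery()'
```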
Now promote the first standby with:
$ ltcluster -f /etc/ltcluster.conf standby promote
This will produce output similar to the following:
INFO: connecting to standby database
NOTICE: promoting standby
DETAIL: promoting server using "lt_ctl -l /var/log/lightdb/startup.log -w -D '/var/lib/lightdb/data' promote"
server promoting
INFO: reconnecting to promoted server
NOTICE: STANDBY PROMOTE successful
DETAIL: node 2 was successfully promoted to primary
Executing ltcluster cluster show will show the current state; as there is now an active primary, the previous warning will not be displayed:
$ ltcluster -f /etc/ltcluster.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------------
 1  | node1 | primary | - failed  |          | default  | host=node1 dbname=ltcluster user=ltcluster
 2  | node2 | primary | * running |          | default  | host=node2 dbname=ltcluster user=ltcluster
 3  | node3 | standby |   running | node1    | default  | host=node3 dbname=ltcluster user=ltcluster
However, the sole remaining standby (node3) is still trying to replicate from the failed primary; ltcluster standby follow must now be executed to rectify this situation (see Chapter 6 for an example).
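Assuming node3 uses the same configuration file path as the other nodes, the follow command would be executed on node3 along these lines:

```shell
# Run on node3: repoint the standby at the new primary (node2),
# which ltcluster determines from its metadata.
$ ltcluster -f /etc/ltcluster.conf standby follow
```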