Chapter 5. Promoting a standby server with ltcluster

If a primary server fails or needs to be removed from the replication cluster, a new primary server must be designated so that the cluster continues to function correctly. This can be done with ltcluster standby promote, which promotes the standby on the current server to primary.

To demonstrate this, set up a replication cluster with a primary and two attached standby servers so that the cluster looks like this:

    $ ltcluster -f /etc/ltcluster.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | * running |          | default  | host=node1 dbname=ltcluster user=ltcluster
     2  | node2 | standby |   running | node1    | default  | host=node2 dbname=ltcluster user=ltcluster
     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=ltcluster user=ltcluster

Stop the current primary, for example with:

    $ lt_ctl -D /var/lib/lightdb/data -m fast stop

At this point the replication cluster will be in a partially disabled state, with both standbys accepting read-only connections while attempting to connect to the stopped primary. Note that the ltcluster metadata table will not yet have been updated; executing ltcluster cluster show will report the discrepancy:

    $ ltcluster -f /etc/ltcluster.conf cluster show
     ID | Name  | Role    | Status        | Upstream | Location | Connection string
    ----+-------+---------+---------------+----------+----------+--------------------------------------
     1  | node1 | primary | ? unreachable |          | default  | host=node1 dbname=ltcluster user=ltcluster
     2  | node2 | standby |   running     | node1    | default  | host=node2 dbname=ltcluster user=ltcluster
     3  | node3 | standby |   running     | node1    | default  | host=node3 dbname=ltcluster user=ltcluster

    WARNING: following issues were detected
    node "node1" (ID: 1) is registered as an active primary but is unreachable
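
A script that monitors the cluster may want to detect this state automatically. The following is a minimal sketch, assuming the column layout of ltcluster cluster show matches the listing above. The function name unreachable_primaries is hypothetical and not part of ltcluster; it only parses text read from standard input.

```shell
# Hypothetical helper (not part of ltcluster): reads "cluster show" output
# on stdin and prints the name of every node that is registered as primary
# but whose Status column contains "unreachable".
# Assumes the pipe-separated column order shown above:
#   ID | Name | Role | Status | Upstream | Location | Connection string
unreachable_primaries() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Only table rows have seven pipe-separated fields; the separator
        # line and WARNING lines contain no "|" and are skipped.
        NF >= 7 {
            name = trim($2); role = trim($3); status = trim($4)
            if (role == "primary" && status ~ /unreachable/) print name
        }'
}

# Example invocation (run on a node where ltcluster is installed):
#   ltcluster -f /etc/ltcluster.conf cluster show | unreachable_primaries
```

With the output shown above, the function would print node1. It prints nothing when every registered primary is running.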

Now promote the first standby (node2) by running the following on node2:

    $ ltcluster -f /etc/ltcluster.conf standby promote

This will produce output similar to the following:

    INFO: connecting to standby database
    NOTICE: promoting standby
    DETAIL: promoting server using "lt_ctl -l /var/log/lightdb/startup.log -w -D '/var/lib/lightdb/data' promote"
    server promoting
    INFO: reconnecting to promoted server
    NOTICE: STANDBY PROMOTE successful
    DETAIL: node 2 was successfully promoted to primary

Executing ltcluster cluster show will show the current state; as there is now an active primary, the previous warning will not be displayed:

    $ ltcluster -f /etc/ltcluster.conf cluster show
     ID | Name  | Role    | Status    | Upstream | Location | Connection string
    ----+-------+---------+-----------+----------+----------+--------------------------------------
     1  | node1 | primary | - failed  |          | default  | host=node1 dbname=ltcluster user=ltcluster
     2  | node2 | primary | * running |          | default  | host=node2 dbname=ltcluster user=ltcluster
     3  | node3 | standby |   running | node1    | default  | host=node3 dbname=ltcluster user=ltcluster

However, the sole remaining standby (node3) is still trying to replicate from the failed primary; ltcluster standby follow must now be executed to rectify this situation (see Chapter 6 for an example).
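
In a larger cluster it can be useful to list every standby that still points at a node which is no longer a running primary. The sketch below does this by parsing ltcluster cluster show output, again assuming the column layout shown in this chapter. The function name stale_standbys is hypothetical and not part of ltcluster. Note that this simple check would also flag a cascaded standby whose upstream is another standby, so it is only suitable for clusters where all standbys attach directly to the primary.

```shell
# Hypothetical helper (not part of ltcluster): reads "cluster show" output
# on stdin and prints each standby whose Upstream column does not name a
# node that is currently a running primary.
# Assumes the pipe-separated column order used in this chapter:
#   ID | Name | Role | Status | Upstream | Location | Connection string
stale_standbys() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 7 {
            name = trim($2); role = trim($3)
            status = trim($4); upstream = trim($5)
            # Remember which nodes are running primaries.
            if (role == "primary" && status ~ /running/) running[name] = 1
            # Remember each standby and its registered upstream.
            if (role == "standby") up[name] = upstream
            order[++n] = name
        }
        END {
            # Print standbys in table order whose upstream is not a
            # running primary.
            for (i = 1; i <= n; i++) {
                name = order[i]
                if ((name in up) && !(up[name] in running)) print name
            }
        }'
}

# Example invocation (run on a node where ltcluster is installed):
#   ltcluster -f /etc/ltcluster.conf cluster show | stale_standbys
```

With the post-promotion output shown above, the function would print node3, the standby on which ltcluster standby follow still needs to be run.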