ltcluster standby follow — attach a running standby to a new upstream node
Attaches the standby ("follow candidate") to a new upstream node ("follow target"). Typically this will be the primary, but this command can also be used to attach the standby to another standby.
This command requires a valid ltcluster.conf file for the standby, either specified explicitly with -f/--config-file or located in a default location; no additional arguments are required.
The standby node ("follow candidate") must be running. If the new upstream ("follow target") is not the primary, the cluster primary must be running and accessible from the standby node.
To re-add an inactive node to the replication cluster, use ltcluster node rejoin.
By default ltcluster will attempt to attach the standby to the current primary. If --upstream-node-id is provided, ltcluster will attempt to attach the standby to the specified node, which can be another standby.
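For example, to attach the standby to the node with ID 2 (an illustrative node ID), the command might be invoked as follows:

    $ ltcluster -f /etc/ltcluster.conf standby follow --upstream-node-id=2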
In LightDB 21 and later, by default this command will signal LightDB to reload its configuration, which will cause LightDB to follow the new upstream without a restart. If this behaviour is not desired for whatever reason, the configuration file parameter standby_follow_restart can be set to true to always force a restart.
ltcluster standby follow will wait up to standby_follow_timeout seconds (default: 30) to verify the standby has actually connected to the new upstream node.
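As a sketch, both behaviours described above could be controlled in ltcluster.conf with settings along these lines (the values shown are illustrative):

    # ltcluster.conf
    standby_follow_restart=true   # always restart LightDB rather than reload its configuration
    standby_follow_timeout=60     # wait up to 60 seconds for the standby to connect to the new upstream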
If recovery_min_apply_delay is set for the standby, it will not attach to the new upstream node until it has replayed available WAL. Conversely, if the standby is attached to an upstream standby which has recovery_min_apply_delay set, the upstream standby's replay state may actually be behind that of its new downstream node.
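Note that recovery_min_apply_delay is a LightDB server parameter rather than an ltcluster.conf setting; a delayed standby might, for example, have the following in its server configuration file:

    recovery_min_apply_delay = '5min'   # delay WAL replay by five minutes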
    $ ltcluster -f /etc/ltcluster.conf standby follow
    INFO: setting node 3's primary to node 2
    NOTICE: restarting server using "lt_ctl -l /var/log/lightdb/startup.log -w -D '/var/lib/lightdb/data' restart"
    waiting for server to shut down........ done
    server stopped
    waiting for server to start.... done
    server started
    NOTICE: STANDBY FOLLOW successful
    DETAIL: node 3 is now attached to node 2
--dry-run
Check prerequisites but don't actually follow a new upstream node.
This will also verify whether the standby is capable of following the new upstream node.
If a standby was turned into a primary by removing recovery.conf (LightDB 21 and later: standby.signal), ltcluster will not be able to determine whether that primary's timeline has diverged from the timeline of the standby ("follow candidate"). We recommend always using ltcluster standby promote to promote a standby to primary, as this will ensure that the new primary performs a timeline switch (making it practical to check for timeline divergence) and that ltcluster metadata is updated correctly.
--upstream-node-id
Node ID of the new upstream node ("follow target"). If not provided, ltcluster will attempt to follow the current primary node.
Note that when using ltclusterd, --upstream-node-id should always be configured; see Automatic failover configuration for details.
-w
--wait
Wait for a primary to appear. ltcluster will wait for up to primary_follow_timeout seconds (default: 60 seconds) to verify that the standby is following the new primary. This value can be defined in ltcluster.conf.
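For example, to wait for a new primary to appear after a failover:

    $ ltcluster -f /etc/ltcluster.conf standby follow --wait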
Execute with the --dry-run option to test the follow operation as far as possible, without actually changing the status of the node.
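For example (output omitted here, as it depends on the cluster's state):

    $ ltcluster -f /etc/ltcluster.conf standby follow --dry-run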
Note that ltcluster will first attempt to determine whether the standby ("follow candidate") is capable of following the new upstream node ("follow target").
If the new upstream node has diverged from this node's timeline, for example because the new upstream node was promoted to primary while this node was still attached to the original primary, it will not be possible to follow the new upstream node, and ltcluster will emit an error message like this:
    ERROR: this node cannot attach to follow target node "node3" (ID 3)
    DETAIL: follow target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/6108880
In this case, it may be possible to have this node follow the new upstream using ltcluster node rejoin with the --force-rewind option to execute lt_rewind. This does mean that transactions which exist on this node, but not on the new upstream, will be lost.
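A minimal sketch of such a rejoin follows; note that ltcluster node rejoin typically requires additional options (such as connection details for a node in the cluster), so consult its documentation for the full invocation:

    $ ltcluster -f /etc/ltcluster.conf node rejoin --force-rewind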
One of the following exit codes will be emitted by ltcluster standby follow:
SUCCESS (0)
The follow operation succeeded; or if --dry-run
was provided,
no issues were detected which would prevent the follow operation.
ERR_BAD_CONFIG (1)
A configuration issue was detected which prevented ltcluster from continuing with the follow operation.
ERR_NO_RESTART (4)
The node could not be restarted.
ERR_DB_CONN (6)
ltcluster was unable to establish a database connection to one of the nodes.
ERR_FOLLOW_FAIL (23)
ltcluster was unable to complete the follow command.
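These exit codes can be used to script follow operations; a minimal shell sketch, using only the codes listed above:

    ltcluster -f /etc/ltcluster.conf standby follow
    rc=$?
    case $rc in
        0)  echo "standby follow succeeded" ;;
        1)  echo "configuration problem (ERR_BAD_CONFIG)" ;;
        4)  echo "node could not be restarted (ERR_NO_RESTART)" ;;
        6)  echo "database connection failure (ERR_DB_CONN)" ;;
        23) echo "follow operation failed (ERR_FOLLOW_FAIL)" ;;
        *)  echo "unexpected exit code: $rc" ;;
    esac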
A standby_follow event notification will be generated. If an event notification command is provided, ltcluster will substitute the placeholders %p with the node ID of the node being followed, %c with its conninfo string, and %a with its node name.
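As a sketch, a notification script receiving these placeholders might be wired up as follows; the parameter name event_notification_command and the script path are assumptions here, so verify them against the event notifications documentation:

    # ltcluster.conf (parameter name assumed; placeholders per the text above)
    event_notification_command='/usr/local/bin/follow-notify.sh %p %c %a'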