ltcluster standby switchover — promote a standby to primary and demote the existing primary to a standby
Promotes a standby to primary and demotes the existing primary to a standby. This command must be run on the standby to be promoted, and requires a passwordless SSH connection to the current primary.
If other nodes are connected to the demotion candidate, ltcluster can instruct these to follow the new primary if the option --siblings-follow is specified. This requires a passwordless SSH connection between the promotion candidate (new primary) and the nodes attached to the demotion candidate (existing primary). Note that a witness server, if in use, is also counted as a "sibling node" as it needs to be instructed to synchronise its metadata with the new primary.
Performing a switchover is a non-trivial operation. In particular it relies on the current primary being able to shut down cleanly and quickly. ltcluster will attempt to check for potential issues but cannot guarantee a successful switchover.
ltcluster will refuse to perform the switchover if an exclusive backup is running on the current primary, or if WAL replay is paused on the standby.
For more details on performing a switchover, including preparation and configuration, see section Performing a switchover with ltcluster.
CHECKPOINT
ltcluster executes CHECKPOINT on the demotion candidate as part of the shutdown process to ensure it shuts down as smoothly as possible.
Note that CHECKPOINT requires database superuser permissions to execute. If the ltcluster user is not a superuser, the name of a superuser should be provided with the -S/--superuser option.
If ltcluster is unable to execute the CHECKPOINT command, the switchover can still be carried out, albeit at a greater risk that the demotion candidate may not be able to shut down as smoothly as might otherwise have been the case.
pg_promote() (LightDB 21 and later)
From LightDB 21, ltcluster defaults to using the built-in pg_promote() function to promote a standby to primary.
Note that execution of pg_promote() is restricted to superusers or to any user who has been granted execution permission for this function. If the ltcluster user is not permitted to execute pg_promote(), ltcluster will fall back to using "lt_ctl promote". For more details see ltcluster standby promote.
--always-promote
Promote standby to primary, even if it is behind or has diverged from the original primary. The original primary will be shut down in any case, and will need to be manually reintegrated into the replication cluster.
--dry-run
Check prerequisites but don't actually execute a switchover.
Success of --dry-run does not imply the switchover will complete successfully, only that the prerequisites for performing the operation are met.
-F
--force
Ignore warnings and continue anyway.
Specifically, if a problem is encountered when shutting down the current primary, using -F/--force will cause ltcluster to continue by promoting the standby to be the new primary, and if --siblings-follow is specified, attach any other standbys to the new primary.
--force-rewind[=/path/to/lt_rewind]
Use lt_rewind to reintegrate the old primary if necessary (and the prerequisites for using lt_rewind are met).
If using LightDB 21, and the lt_rewind binary is not installed in the LightDB bin directory, provide its full path. For more details see also Switchover and lt_rewind.
-R
--remote-user
System username for remote SSH operations (defaults to local system user).
--ltclusterd-no-pause
Don't pause ltclusterd while executing a switchover.
This option should not be used unless you take steps by other means to ensure ltclusterd is paused or not running on all nodes.
This option cannot be used together with --ltclusterd-force-unpause.
--ltclusterd-force-unpause
Always unpause all ltclusterd instances after executing a switchover. This will ensure that any ltclusterd instances which were paused before the switchover will be unpaused.
This option cannot be used together with --ltclusterd-no-pause.
--siblings-follow
Have nodes attached to the old primary follow the new primary.
This will also ensure that a witness node, if in use, is updated with the new primary's data.
In a future ltcluster release, --siblings-follow will be applied by default.
-S
--superuser
Use the named superuser instead of the normal ltcluster user to perform actions requiring superuser permissions.
The following parameters in ltcluster.conf are relevant to the switchover operation:
replication_lag_critical
If replication lag (in seconds) on the standby exceeds this value, the switchover will be aborted (unless the -F/--force option is provided).
shutdown_check_timeout
The maximum number of seconds to wait for the demotion candidate (current primary) to shut down, before aborting the switchover.
Note that this parameter is set on the node where ltcluster standby switchover is executed (promotion candidate); setting it on the demotion candidate (former primary) will have no effect.
wal_receive_check_timeout
After the primary has shut down, the maximum number of seconds to wait for the walreceiver on the standby to flush WAL to disk before comparing WAL receive location with the primary's shut down location.
standby_reconnect_timeout
The maximum number of seconds to wait for the demotion candidate (former primary) to reconnect to the promoted primary (default: 60 seconds).
Note that this parameter is set on the node where ltcluster standby switchover is executed (promotion candidate); setting it on the demotion candidate (former primary) will have no effect.
node_rejoin_timeout
The maximum number of seconds to wait for the demotion candidate (former primary) to reconnect to the promoted primary (default: 60 seconds).
Note that this parameter is set on the demotion candidate (former primary); setting it on the node where ltcluster standby switchover is executed will have no effect.
However, this value must be less than standby_reconnect_timeout on the promotion candidate (the node where ltcluster standby switchover is executed).
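As a rough illustration of how these settings relate, the fragment below sketches the relevant lines of ltcluster.conf on each node. All values are illustrative assumptions, not recommendations; the only constraint taken from the text above is that node_rejoin_timeout on the demotion candidate is lower than standby_reconnect_timeout on the promotion candidate.

```ini
# ltcluster.conf on the promotion candidate (the node where the switchover is run)
# Values below are illustrative assumptions only.
replication_lag_critical=600
shutdown_check_timeout=60
wal_receive_check_timeout=30
standby_reconnect_timeout=60

# ltcluster.conf on the demotion candidate (current primary)
# Must be less than standby_reconnect_timeout on the promotion candidate.
node_rejoin_timeout=30
```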
Execute with the --dry-run option to test the switchover as far as possible without actually changing the status of either node.
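One way to make the dry run a routine step is to wrap both invocations in a small shell function. The sketch below assumes ltcluster is on the PATH and is run on the standby to be promoted; safe_switchover is a hypothetical helper name, and any options passed to it (for example --siblings-follow) are forwarded unchanged to both invocations.

```shell
#!/bin/sh
# Sketch: attempt the real switchover only if the dry run reports no problems.
# safe_switchover is a hypothetical helper, not part of ltcluster.
safe_switchover() {
    # Check prerequisites without changing the status of either node.
    if ! ltcluster standby switchover "$@" --dry-run; then
        echo "dry run reported problems; switchover not attempted" >&2
        return 1
    fi
    # A successful dry run does not guarantee that the switchover will succeed.
    ltcluster standby switchover "$@"
}
```

For example, `safe_switchover --siblings-follow` would run the dry run first and proceed to the actual switchover only if the dry run exits with 0.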
External database connections, e.g. from an application, should not be permitted while the switchover is taking place. In particular, active transactions on the primary can potentially disrupt the shutdown process.
standby_switchover and standby_promote event notifications will be generated for the new primary, and a node_rejoin event notification for the former primary (new standby).
If using an event notification script, standby_switchover will populate the placeholder parameter %p with the node ID of the former primary.
One of the following exit codes will be emitted by ltcluster standby switchover:
SUCCESS (0)
The switchover completed successfully; or if --dry-run was provided, no issues were detected which would prevent the switchover operation.
ERR_SWITCHOVER_FAIL (18)
The switchover could not be executed.
ERR_SWITCHOVER_INCOMPLETE (22)
The switchover was executed but a problem was encountered. Typically this means the former primary could not be reattached as a standby. Check preceding log messages for more information.
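A calling script can branch on these exit codes. The following is a minimal sketch; describe_switchover_exit is a hypothetical helper, and the only codes it recognises are the three listed above.

```shell
#!/bin/sh
# Sketch: translate an "ltcluster standby switchover" exit code into its name.
# describe_switchover_exit is a hypothetical helper, not part of ltcluster.
describe_switchover_exit() {
    case "$1" in
        0)  echo "SUCCESS" ;;
        18) echo "ERR_SWITCHOVER_FAIL" ;;
        22) echo "ERR_SWITCHOVER_INCOMPLETE" ;;
        *)  echo "UNKNOWN" ;;
    esac
}

# Example use after a switchover attempt:
#   ltcluster standby switchover --siblings-follow
#   describe_switchover_exit "$?"
```

A result of ERR_SWITCHOVER_INCOMPLETE typically means the former primary could not be reattached as a standby, so the preceding log messages are the place to look.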
See also: ltcluster standby follow, ltcluster node rejoin.
For more details on performing a switchover operation, see the section Performing a switchover with ltcluster.