ltcluster cluster matrix — runs ltcluster cluster show on each node and summarizes output
ltcluster cluster matrix runs ltcluster cluster show on each node and arranges the results in a matrix, recording success or failure.
ltcluster cluster matrix requires a valid ltcluster.conf file on each node. Additionally, passwordless SSH connections are required between all nodes.
Example 1 (all nodes up):
$ ltcluster -f /etc/ltcluster.conf cluster matrix
 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  *
 node2 |  2 |  * |  * |  *
 node3 |  3 |  * |  * |  *
Example 2 (node1 and node2 up, node3 down):
$ ltcluster -f /etc/ltcluster.conf cluster matrix
 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  x
 node3 |  3 |  ? |  ? |  ?
Each row corresponds to one server, and indicates the result of testing an outbound connection from that server.
Since node3 is down, all the entries in its row are filled with ?, meaning that we cannot test outbound connections from that node.
The other two nodes are up; their rows have x in the column corresponding to node3, meaning that inbound connections to that node have failed, and * in the columns corresponding to node1 and node2, meaning that inbound connections to these nodes have succeeded.
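
The row and column reading described above can be sketched in a few lines of Python. This is only an illustration, not part of ltcluster: the helper names parse_matrix and summarize are hypothetical, and the parser assumes the output layout is exactly the one shown in the examples (a header row, a separator row, then one row per node).

```python
EXAMPLE_2 = """\
 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  x
 node3 |  3 |  ? |  ? |  ?
"""


def parse_matrix(text):
    """Return {source_name: {target_id: symbol}} from matrix output.

    Assumes the layout shown in the examples; the separator line and any
    command line are skipped because they contain no "|" character.
    """
    lines = [ln for ln in text.splitlines() if "|" in ln]
    header = [cell.strip() for cell in lines[0].split("|")]
    target_ids = [int(cell) for cell in header[2:]]  # columns after Name, Id
    matrix = {}
    for ln in lines[1:]:
        cells = [cell.strip() for cell in ln.split("|")]
        matrix[cells[0]] = dict(zip(target_ids, cells[2:]))
    return matrix


def summarize(matrix):
    """Split the matrix into failed connections and untestable sources."""
    failed = []   # (source, target_id) pairs marked "x"
    unknown = []  # sources whose whole row is "?"
    for source, row in matrix.items():
        if all(symbol == "?" for symbol in row.values()):
            unknown.append(source)
        for target_id, symbol in row.items():
            if symbol == "x":
                failed.append((source, target_id))
    return failed, unknown


failed, unknown = summarize(parse_matrix(EXAMPLE_2))
print("failed connections:", failed)
print("outbound state unknown for:", unknown)
```

For the Example 2 output, the failed list holds the connections from node1 and node2 to node 3, and node3 is reported as a source whose outbound connections could not be tested.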
Example 3 (all nodes up, firewall dropping packets originating from node1 and directed to port 5432 on node3) - running ltcluster cluster matrix from node1 gives the following output:
$ ltcluster -f /etc/ltcluster.conf cluster matrix
 Name  | Id |  1 |  2 |  3
-------+----+----+----+----
 node1 |  1 |  * |  * |  x
 node2 |  2 |  * |  * |  *
 node3 |  3 |  ? |  ? |  ?
Note that this may take some time, depending on the connect_timeout setting in the node conninfo strings. The default is 1 minute, which means that without modification the above command would take around 2 minutes to run; see the comment elsewhere about setting connect_timeout.
The matrix tells us that we cannot connect from node1 to node3, and that (therefore) we don't know the state of any outbound connection from node3.
In this case, the ltcluster cluster crosscheck command will produce a more useful result.
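
Because the run time in Example 3 depends on connect_timeout, it can help to check whether a node's conninfo string sets that parameter at all. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the host, user and database values are made up for illustration, and the parser is a simplification that uses shell-style splitting rather than the full conninfo quoting rules.

```python
import shlex


def conninfo_params(conninfo):
    """Split a keyword=value conninfo string into a dict (simplified parser)."""
    params = {}
    for token in shlex.split(conninfo):
        key, _, value = token.partition("=")
        params[key] = value
    return params


def has_connect_timeout(conninfo):
    """Return True if the conninfo string sets connect_timeout explicitly."""
    return "connect_timeout" in conninfo_params(conninfo)


# Illustrative conninfo strings; the values are assumptions, not defaults.
with_timeout = "host=node1 user=ltcluster dbname=ltcluster connect_timeout=2"
without_timeout = "host=node3 user=ltcluster dbname=ltcluster"

print(has_connect_timeout(with_timeout))
print(has_connect_timeout(without_timeout))
```

A conninfo string without an explicit connect_timeout falls back to the default described in the note above, which is what makes a run against an unreachable node slow.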
One of the following exit codes will be emitted by ltcluster cluster matrix:
SUCCESS (0)
The check completed successfully and all nodes are reachable.
ERR_BAD_SSH (12)
One or more nodes could not be accessed via SSH.
ERR_NODE_STATUS (25)
LightDB on one or more nodes could not be reached. This error code overrides ERR_BAD_SSH.
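
A monitoring script might map these exit codes to messages along the following lines. This is a sketch, not an official wrapper: the function names describe_exit_code and run_matrix are hypothetical, the default configuration path is the one used in the examples above, and only the three exit codes listed here are recognized.

```python
import subprocess

# Exit codes and meanings as listed above.
EXIT_CODES = {
    0: ("SUCCESS", "The check completed successfully and all nodes are reachable."),
    12: ("ERR_BAD_SSH", "One or more nodes could not be accessed via SSH."),
    25: ("ERR_NODE_STATUS", "LightDB on one or more nodes could not be reached."),
}


def describe_exit_code(code):
    """Return a one-line description for an exit code of the matrix command."""
    name, meaning = EXIT_CODES.get(
        code, ("UNKNOWN", "Exit code not listed for this command.")
    )
    return f"{name} ({code}): {meaning}"


def run_matrix(config_path="/etc/ltcluster.conf"):
    """Run the matrix command and return (exit_code, stdout).

    Requires ltcluster to be installed and on PATH; it is defined here
    but deliberately not called.
    """
    result = subprocess.run(
        ["ltcluster", "-f", config_path, "cluster", "matrix"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

On a node where ltcluster is installed, calling run_matrix() and passing the returned code to describe_exit_code() gives a readable status line. Since ERR_NODE_STATUS overrides ERR_BAD_SSH, an exit code of 25 does not rule out SSH failures as well.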