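Preface
===================

This document describes how to build a LightDB distributed cluster by hand. It is intended as a reference for developers of deployment automation and for customers.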

Environment overview
====================

The test environment consists of the following 8 machines:

.. code:: 

   172.16.0.11
   172.16.0.12
   172.16.0.13
   172.16.0.14
   172.16.0.15
   172.16.0.16
   172.16.0.17
   172.16.0.18

For brevity, the rest of this document refers to a machine by the last octet of its IP address: 11, 12, ..., 18.

.. _install_node:

Installing a distributed node
=============================

.. _install_single_node:

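Installing a standalone database
--------------------------------

First obtain the ``LightDB 22.4`` installation package and extract it. Then follow the `LightDB Installation Manual <http://www.light-pg.com/docs/LightDB_Install_Manual/current/install.html#id9>`__
to install a standalone instance on every machine.

For convenience, the installer writes the LightDB environment variables into ``~/.bashrc``. Log in to the shell again, or run ``source ~/.bashrc``, for them to take effect.

In this demonstration the database instance is installed under ``/data/lightdb``. You may install it in another directory, as long as the lightdb user has permission on that directory.

After installation, check the ``LTHOME`` and ``LTDATA`` environment variables to find the actual installation directory and instance directory: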

.. code:: shell

   [lightdb@0b3770c2d30a ~]$ echo $LTHOME
   /data/lightdb/lightdb-x/13.8-22.4
   [lightdb@0b3770c2d30a ~]$ echo $LTDATA
   /data/lightdb/lightdb-x/13.8-22.4/data/defaultCluster/


.. _enable_distributed:

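Enabling the distributed extension
----------------------------------

The distributed features of LightDB are implemented in the canopy extension. A standalone installation does not enable canopy, so it must be enabled by hand on every machine.

First edit ``$LTDATA/lightdb.conf`` and change the GUC parameter ``shared_preload_libraries`` so that ``canopy`` is the first entry.

For example: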

.. code:: shell

   # Original value
   shared_preload_libraries='lt_stat_statements,lt_stat_activity,lt_prewarm,lt_cron,lt_hint_plan,lt_show_plans'

   # After the change (canopy must be the first entry, not at any other position)
   shared_preload_libraries='canopy,lt_stat_statements,lt_stat_activity,lt_prewarm,lt_cron,lt_hint_plan,lt_show_plans'

Then restart the database so that the parameter takes effect: ``lt_ctl restart`` .
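
The edit above can also be scripted. The following is a minimal sketch and not part of LightDB: the function name ``prepend_canopy`` is hypothetical, and it assumes GNU ``sed``, a non-empty single-quoted value, and that ``canopy`` does not already appear later in the list.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper (not shipped with LightDB): make "canopy" the first
   # entry of shared_preload_libraries in the given configuration file.
   # Assumes GNU sed and a non-empty, single-quoted value as shown above.
   prepend_canopy() {
       local conf="$1"
       local key="^[[:space:]]*shared_preload_libraries[[:space:]]*=[[:space:]]*'"

       # Nothing to do if canopy is already the first entry.
       if grep -Eq "${key}canopy[,']" "$conf"; then
           return 0
       fi

       # Keep a backup, then insert "canopy," right after the opening quote.
       cp "$conf" "$conf.bak"
       sed -i -E "s/(${key})/\1canopy,/" "$conf"
   }

A possible invocation is ``prepend_canopy "$LTDATA/lightdb.conf" && lt_ctl restart`` . Running it twice leaves the file unchanged the second time.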

Log in to the database with the ``ltsql`` tool, then create the test database ``test1`` and the ``canopy`` extension.

.. code:: shell

   [lightdb@0b3770c2d30a ~]$ ltsql 
   ltsql (13.8-22.4)
   Type "help" for help.

   # Note: create the test1 database for testing
   lightdb@postgres=# CREATE DATABASE  test1;
   NOTICE:  Canopy partially supports CREATE DATABASE for distributed databases
   DETAIL:  Canopy does not propagate CREATE DATABASE command to workers
   HINT:  You can manually create a database and its extensions on workers.
   CREATE DATABASE

   # Note: switch to the new test1 database (later, ltsql -d test1 logs in to it directly)
   lightdb@postgres=# \c test1
   You are now connected to database "test1" as user "lightdb".

   # Note: create the canopy extension
   lightdb@test1=# CREATE EXTENSION canopy;
   CREATE EXTENSION


The extension is now created. Note that an extension belongs to a single database: this demonstration uses ``test1``, but you can use another database.


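Passwordless login for LightDB
------------------------------

The nodes call each other to run distributed execution plans, so every node machine must be able to log in to the databases on the other nodes without a password. Configure this as follows:

Edit ``$LTDATA/lt_hba.conf`` and add the following entries: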

.. code:: shell

   host    all             all             172.16.0.1/24           trust
   host    replication     all             172.16.0.1/24           trust

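These entries allow every client in the ``172.16.0`` subnet to log in to the database without a password.

After editing, run ``lt_ctl reload`` to reload the configuration file (no database restart is needed).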
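
Because the same two entries are needed on every machine, the step can be scripted. This is a minimal sketch under stated assumptions: ``add_trust_rules`` is a hypothetical name, the file is assumed to end with a newline, and only lines written in exactly this spacing are recognized as already present.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: append trust rules for one subnet to an HBA file,
   # skipping any rule line that is already present verbatim.
   # Usage: add_trust_rules <hba file> <CIDR>
   add_trust_rules() {
       local hba="$1" cidr="$2" db rule
       for db in all replication; do
           rule="host    ${db}    all    ${cidr}    trust"
           # -x matches whole lines, -F treats the rule as a fixed string.
           grep -qxF "$rule" "$hba" || echo "$rule" >> "$hba"
       done
   }

For example: ``add_trust_rules "$LTDATA/lt_hba.conf" 172.16.0.1/24 && lt_ctl reload`` .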

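Distributed deployment with 1 CN and 2 DNs
==========================================

This section builds a distributed cluster of 1 CN (coordinator node) and 2 DNs (data nodes) on 3 machines, starting from standalone LightDB instances.
The LightDB installer supports this layout directly (multi-machine, single-instance deployment);
the standalone-based approach is used here to show the deployment details.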

.. graphviz::

   digraph foo {
      edge [ fontname="NSimSun",fontsize=10];
      node [ fontname="NSimSun",shape = box];
      CN[label="CN(11)"];
      DN1[label="DN(12)"];
      DN2[label="DN(13)"];
      
      CN -> {DN1, DN2}
      {rank=same; DN1, DN2};
   }

Deploying the distributed nodes
-------------------------------

Following :ref:`install_node` , deploy a distributed node on each of machines 11, 12 and 13.


Adding the distributed nodes
----------------------------

Log in to the database on machine 11 (the CN) and add the two DN nodes:

.. code:: shell

   [lightdb@0b3770c2d30a defaultCluster]$ ltsql -d test1
   ltsql (13.8-22.4)
   Type "help" for help.

   lightdb@test1=# SELECT canopy_add_node('172.16.0.12', 5432);
   canopy_add_node 
   -----------------
                  2
   (1 row)

   lightdb@test1=# SELECT canopy_add_node('172.16.0.13', 5432);
   canopy_add_node 
   -----------------
                  3
   (1 row)

   lightdb@test1=# SELECT canopy_set_coordinator_host('172.16.0.11', 5432);
   canopy_set_coordinator_host 
   -----------------------------
   
   (1 row)


This forms a distributed architecture with 11 as the CN node and 12 and 13 as DN nodes.
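
The registration statements above follow a fixed pattern, so they can be generated for any list of DN addresses. The sketch below is illustrative only: ``gen_add_nodes_sql`` is a hypothetical name, and feeding its output to ``ltsql`` on standard input assumes that ``ltsql`` reads SQL from stdin the way ``psql`` does.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: print the SQL that registers the DN nodes and then
   # the coordinator, using only the functions shown above.
   # Usage: gen_add_nodes_sql <cn_ip> <port> <dn_ip>...
   gen_add_nodes_sql() {
       local cn="$1" port="$2" dn
       shift 2
       for dn in "$@"; do
           printf "SELECT canopy_add_node('%s', %s);\n" "$dn" "$port"
       done
       printf "SELECT canopy_set_coordinator_host('%s', %s);\n" "$cn" "$port"
   }

For the cluster in this section, a possible call on 11 is ``gen_add_nodes_sql 172.16.0.11 5432 172.16.0.12 172.16.0.13 | ltsql -d test1`` .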

Node information can be read from the ``pg_dist_node`` table:

.. code:: shell

   lightdb@test1=# select * from pg_dist_node;
   nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards 
   --------+---------+-------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
         2 |       2 | 172.16.0.12 |     5432 | default  | t           | t        | primary  | default     | t              | t
         3 |       3 | 172.16.0.13 |     5432 | default  | t           | t        | primary  | default     | t              | t
         5 |       0 | 172.16.0.11 |     5432 | default  | t           | t        | primary  | default     | t              | f
   (3 rows)


Testing the distributed setup
-----------------------------

Create a distributed table as a simple test:

.. code:: shell

   # Create an ordinary local table
   lightdb@test1=# create table test_table(id int primary key, name text);
   CREATE TABLE

   # Insert 100,000 rows of test data
   lightdb@test1=# insert into test_table select v, v || 'name' from generate_series(1,100000) as v;
   INSERT 0 100000

   # Convert the ordinary table into a distributed table
   lightdb@test1=# select create_distributed_table('test_table', 'id');
   NOTICE:  Copying data from local table...
   NOTICE:  copying the data has completed
   DETAIL:  The local data in the table is no longer visible, but is still on disk.
   HINT:  To remove the local data, run: SELECT truncate_local_data_after_distributing_table($$public.test_table$$)
   create_distributed_table 
   --------------------------
   
   (1 row)


The distributed table now exists, and queries against it use a distributed execution plan:

.. code:: shell 

   lightdb@test1=# explain select count(*) from test_table;
                                                QUERY PLAN                                               
   --------------------------------------------------------------------------------------------------------
   Aggregate  (cost=250.00..250.02 rows=1 width=8)
      ->  Custom Scan (Canopy Adaptive)  (cost=0.00..0.00 rows=100000 width=8)
            Task Count: 32
            Tasks Shown: One of 32
            ->  Task
                  Node: host=172.16.0.12 port=5432 dbname=test1
                  ->  Aggregate  (cost=43.99..44.00 rows=1 width=8)
                        ->  Seq Scan on test_table_102040 test_table  (cost=0.00..38.59 rows=2159 width=0)
   (8 rows)

``canopy_tables`` shows information about distributed tables:

.. code:: shell

   lightdb@test1=# select * from canopy_tables;
   -[ RECORD 1 ]-------+------------
   table_name          | test_table
   canopy_table_type   | distributed
   distribution_column | id
   colocation_id       | 2
   table_size          | 7488 kB
   shard_count         | 32
   table_owner         | lightdb
   access_method       | heap

``canopy_shards`` shows how the table data is distributed:

.. code:: shell

   lightdb@test1=# select table_name,shardid,shard_name,nodename,nodeport from canopy_shards;
   table_name | shardid |    shard_name     |  nodename   | nodeport 
   ------------+---------+-------------------+-------------+----------
   test_table |  102040 | test_table_102040 | 172.16.0.12 |     5432
   test_table |  102041 | test_table_102041 | 172.16.0.13 |     5432
   test_table |  102042 | test_table_102042 | 172.16.0.12 |     5432
   test_table |  102043 | test_table_102043 | 172.16.0.13 |     5432
   ...
   test_table |  102068 | test_table_102068 | 172.16.0.12 |     5432
   test_table |  102069 | test_table_102069 | 172.16.0.13 |     5432
   test_table |  102070 | test_table_102070 | 172.16.0.12 |     5432
   test_table |  102071 | test_table_102071 | 172.16.0.13 |     5432
   (32 rows)


The shards are spread across machines 12 and 13.
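
With 32 shards, the listing is easier to judge as a per-node count. The following sketch is an assumption-laden convenience, not a LightDB tool: ``count_by_node`` is a hypothetical name, and the suggested ``ltsql`` options ``-A`` and ``-t`` are assumed to behave as in ``psql`` (unaligned, tuples only).

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: read node names (one per line) on standard input
   # and print "<node> <shard count>" for each node, sorted by node name.
   count_by_node() {
       sort | uniq -c | awk '{ print $2, $1 }'
   }

A possible use is ``ltsql -d test1 -At -c "select nodename from canopy_shards" | count_by_node`` .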

Scaling out DN nodes
=====================

Here the 1CN/2DN architecture deployed above is extended to 1CN/3DN by adding one DN node.

.. graphviz::

   digraph foo {
      edge [ fontname="NSimSun",fontsize=10];
      node [ fontname="NSimSun",shape = box];
      CN[label="CN(11)"];
      DN1[label="DN(12)"];
      DN2[label="DN(13)"];
      DN3[label="DN(14)"];
      
      CN -> {DN1, DN2, DN3}
      {rank=same; DN1, DN2, DN3};
   }

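Deploying the distributed node
------------------------------

Deploy a distributed node on machine 14 (see :ref:`install_node` ).

Adding the distributed node
---------------------------

On 11 (the CN), run the following SQL to add the DN node: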

.. code:: shell

   lightdb@test1=# SELECT canopy_add_node('172.16.0.14', 5432);
   canopy_add_node 
   -----------------
                  6
   (1 row)

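Redistributing data
-------------------

After a new DN node is added, existing distributed tables are not extended to it automatically, whereas newly created distributed tables can be distributed to node 14 automatically.
To redistribute an existing distributed table, run ``rebalance_table_shards`` . This operation depends on the GUC parameter ``wal_level`` ; if it is lower than ``logical`` , the call fails: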

.. code:: shell

   lightdb@test1=# SELECT rebalance_table_shards('test_table');
   NOTICE:  Moving shard 102041 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   ERROR:  ERROR:  logical decoding requires wal_level >= logical
   CONTEXT:  while executing command on 172.16.0.13:5432
   while executing command on localhost:5432

To adjust ``wal_level`` , follow the same procedure used for ``shared_preload_libraries`` in :ref:`enable_distributed`
and set ``wal_level=logical`` .
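
As a sketch of that change, the helper below rewrites or appends the setting in a configuration file. ``set_wal_level_logical`` is a hypothetical name and GNU ``sed`` is assumed; any trailing comment on an existing ``wal_level`` line is discarded. In PostgreSQL-based servers a change to ``wal_level`` takes effect only after a restart.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: set wal_level to logical in the given configuration
   # file. An existing wal_level line (commented out or not) is rewritten;
   # otherwise a new line is appended. Assumes GNU sed.
   set_wal_level_logical() {
       local conf="$1"
       local pat="^[[:space:]]*#?[[:space:]]*wal_level[[:space:]]*="
       if grep -Eq "$pat" "$conf"; then
           sed -i -E "s/${pat}.*/wal_level = logical/" "$conf"
       else
           echo "wal_level = logical" >> "$conf"
       fi
   }

A possible invocation on a node that reports the error is ``set_wal_level_logical "$LTDATA/lightdb.conf" && lt_ctl restart`` .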

After the change, run the rebalance again:

.. code:: shell

   lightdb@test1=# SELECT rebalance_table_shards('test_table');
   NOTICE:  Moving shard 102041 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102040 from 172.16.0.12:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102043 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102042 from 172.16.0.12:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102045 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102044 from 172.16.0.12:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102047 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102046 from 172.16.0.12:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102049 from 172.16.0.13:5432 to 172.16.0.14:5432 ...
   NOTICE:  Moving shard 102048 from 172.16.0.12:5432 to 172.16.0.14:5432 ...
   rebalance_table_shards 
   ------------------------
   
   (1 row)

After rebalancing, query ``canopy_shards`` again; some shards now reside on node 14:

.. code:: shell

   lightdb@test1=# select table_name,shardid,shard_name,nodename,nodeport from canopy_shards;
   table_name | shardid |    shard_name     |  nodename   | nodeport 
   ------------+---------+-------------------+-------------+----------
   test_table |  102040 | test_table_102040 | 172.16.0.14 |     5432
   test_table |  102041 | test_table_102041 | 172.16.0.14 |     5432
   ...
   test_table |  102070 | test_table_102070 | 172.16.0.12 |     5432
   test_table |  102071 | test_table_102071 | 172.16.0.13 |     5432
   (32 rows)

.. _install_ha:

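CN scale-out and high-availability deployment
=============================================


Both the CN and the DN nodes of a LightDB distributed cluster can be deployed in high-availability form for better reliability and performance.

In this architecture, every node is itself a small high-availability cluster.

This section adds standby 15 to node 11 as an example of building a one-primary, one-standby high-availability cluster (based on ltcluster), which gives the distributed environment the following two-CN structure: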

.. graphviz::

   digraph foo {
      edge [ fontname="NSimSun",fontsize=10 ];
      node [ fontname="NSimSun",shape=box ];
      CN1[label="CN(11)\nprimary"];
      CN2[label="CN2(15)\nstandby"];
      DN1[label="DN(12)"];
      DN2[label="DN(13)"];
      DN3[label="DN(14)"];
      
      CN1 -> CN2 [style=dotted];
      CN1 -> {DN1, DN2, DN3}
      CN2 -> {DN1, DN2, DN3} [style=dotted]
      {rank=same; DN1, DN2, DN3};
      {rank=same; CN1, CN2};
   }

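If you use Patroni, see :ref:`install_patroni` for how to deploy high availability.

Deploying the primary node
--------------------------

First register node 11 as the ltcluster primary so that ltcluster manages it.

On 11, create the configuration file ``${LTHOME}/etc/ltcluster/ltcluster.conf`` :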

.. code:: shell

   cat>${LTHOME}/etc/ltcluster/ltcluster.conf<<EOF
   node_id=11
   node_name='cn-11'
   conninfo='host=172.16.0.11 port=5432 user=ltcluster dbname=ltcluster connect_timeout=2'
   data_directory='${LTDATA}'
   pg_bindir='${LTHOME}/bin'
   failover='automatic'
   log_level=INFO
   log_facility=STDERR
   log_file='${LTHOME}/etc/ltcluster/ltcluster.log'
   shutdown_check_timeout=1800
   use_replication_slots=true
   promote_command='${LTHOME}/bin/ltcluster standby promote -f ${LTHOME}/etc/ltcluster/ltcluster.conf'
   follow_command='${LTHOME}/bin/ltcluster standby follow -f ${LTHOME}/etc/ltcluster/ltcluster.conf  --upstream-node-id=%n'
   EOF

On the database of 11, create the dedicated ltcluster database and user:

.. code:: shell

   ltsql -c "CREATE ROLE ltcluster SUPERUSER PASSWORD 'ltcluster' login;"
   ltsql -c "CREATE DATABASE ltcluster OWNER ltcluster;"

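Add ``ltcluster`` to the ``shared_preload_libraries`` parameter (see :ref:`enable_distributed` ).
canopy has an ordering requirement and must come first; ltcluster can simply follow it. Then restart the database with ``lt_ctl restart`` so that the setting takes effect.

Register the database on 11 as the ltcluster primary: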

.. code:: shell

   ltcluster primary register -f ${LTHOME}/etc/ltcluster/ltcluster.conf -F

Start the ltclusterd daemon:

.. code:: shell

   ltclusterd -d \
      -f ${LTHOME}/etc/ltcluster/ltcluster.conf \
      -p ${LTHOME}/etc/ltcluster/ltcluster.pid

The following command shows the ltcluster cluster status; at this point there is only one node:

.. code:: shell

   [lightdb@0b3770c2d30a ~]$ ltcluster -f ${LTHOME}/etc/ltcluster/ltcluster.conf service status
   ID | Name  | Role    | Status    | Upstream | ltclusterd | PID   | Paused? | Upstream last seen
   ----+-------+---------+-----------+----------+------------+-------+---------+--------------------
   11 | cn-11 | primary | * running |          | running    | 21335 | no      | n/a

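Deploying the standby node
--------------------------

Deploy a standalone instance on machine 15 to serve as the standby of node 11 (see :ref:`install_single_node` ).

On 15, create the configuration file ``${LTHOME}/etc/ltcluster/ltcluster.conf`` :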

.. code:: shell

   cat>${LTHOME}/etc/ltcluster/ltcluster.conf<<EOF
   node_id=15
   node_name='cn-15'
   conninfo='host=172.16.0.15 port=5432 user=ltcluster dbname=ltcluster connect_timeout=2'
   data_directory='${LTDATA}'
   pg_bindir='${LTHOME}/bin'
   failover='automatic'
   log_level=INFO
   log_facility=STDERR
   log_file='${LTHOME}/etc/ltcluster/ltcluster.log'
   shutdown_check_timeout=1800
   use_replication_slots=true
   promote_command='${LTHOME}/bin/ltcluster standby promote -f ${LTHOME}/etc/ltcluster/ltcluster.conf'
   follow_command='${LTHOME}/bin/ltcluster standby follow -f ${LTHOME}/etc/ltcluster/ltcluster.conf  --upstream-node-id=%n'
   EOF
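
The configuration files for 11 and 15 differ only in ``node_id`` , ``node_name`` and the host in ``conninfo`` . As a sketch, the file can be generated from the last octet of the node address. ``gen_ltcluster_conf`` is a hypothetical name; the settings are copied from the files above, and ``LTHOME`` and ``LTDATA`` are expected to be set.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: print an ltcluster.conf for one CN node, using the
   # same settings as the two files above. Expects LTHOME and LTDATA to be set.
   # Usage: gen_ltcluster_conf <last octet of the node IP>
   gen_ltcluster_conf() {
       local id="$1"
       local conf="${LTHOME}/etc/ltcluster/ltcluster.conf"
       # The here-document terminator must start at the beginning of the line.
       cat <<EOF
   node_id=${id}
   node_name='cn-${id}'
   conninfo='host=172.16.0.${id} port=5432 user=ltcluster dbname=ltcluster connect_timeout=2'
   data_directory='${LTDATA}'
   pg_bindir='${LTHOME}/bin'
   failover='automatic'
   log_level=INFO
   log_facility=STDERR
   log_file='${LTHOME}/etc/ltcluster/ltcluster.log'
   shutdown_check_timeout=1800
   use_replication_slots=true
   promote_command='${LTHOME}/bin/ltcluster standby promote -f ${conf}'
   follow_command='${LTHOME}/bin/ltcluster standby follow -f ${conf}  --upstream-node-id=%n'
   EOF
   }

For example, on 15: ``gen_ltcluster_conf 15 > "${LTHOME}/etc/ltcluster/ltcluster.conf"`` .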

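Run ``lt_ctl stop`` to stop the database. Node 15 will be the standby of node 11, so its database must be stopped while the instance data is copied from node 11.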

.. code:: shell

   # Note: this step copies the complete instance data from 11 to 15.
   # If the database already holds a lot of data, the copy can take a long time.
   ltcluster -h 172.16.0.11 -p 5432 \
      -U ltcluster -d ltcluster \
      --log-level=DEBUG --verbose \
      -f ${LTHOME}/etc/ltcluster/ltcluster.conf standby clone -F

Run ``lt_ctl start`` to start the database.

Run the following command to register this node with ltcluster as a standby:

.. code:: shell

   ltcluster -f ${LTHOME}/etc/ltcluster/ltcluster.conf standby register

Start the ltclusterd daemon:

.. code:: shell

   ltclusterd -d \
      -f ${LTHOME}/etc/ltcluster/ltcluster.conf \
      -p ${LTHOME}/etc/ltcluster/ltcluster.pid

Checking the cluster status now shows that the one-primary, one-standby CN deployment is complete:

.. code:: shell

   [lightdb@fc9eb9ccfa94 ~]$ ltcluster -f ${LTHOME}/etc/ltcluster/ltcluster.conf service status
   ID | Name  | Role    | Status    | Upstream | ltclusterd | PID   | Paused? | Upstream last seen
   ----+-------+---------+-----------+----------+------------+-------+---------+--------------------
   11 | cn-11 | primary | * running |          | running    | 21335 | no      | n/a                
   15 | cn-15 | standby |   running | cn-11    | running    | 1302  | no      | 1 second(s) ago 

About the CN standby node
-------------------------

By default, only read-only operations can be run on 15:

.. code:: shell

   lightdb@test1=# select count(*) from test_table;
   count  
   --------
   100000
   (1 row)

   lightdb@test1=# update test_table set name='abc' where id = 1;
   ERROR:  writing to worker nodes is not currently allowed
   DETAIL:  the database is read-only

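In a LightDB high-availability standby, all data comes from streaming replication, so the standby cannot accept writes directly.
In a distributed environment, however, DML is dispatched to the DN nodes for execution,
so once the ``canopy.writable_standby_coordinator`` option is enabled, DML on distributed tables can be run on the CN standby 15.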

.. code:: shell

   lightdb@test1=# set canopy.writable_standby_coordinator=on;
   SET
   lightdb@test1=# update test_table set name='abc' where id = 1;
   UPDATE 1

   # DDL is still not supported
   lightdb@test1=# create table test_table2(id int primary key);
   ERROR:  cannot execute CREATE TABLE in a read-only transaction


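About DN node high availability
=================================

A standby can be added to a DN node in the same way (see :ref:`install_ha` ); the deployment details are not repeated here. Assume that standby node 16 has been added to DN node 12, giving the following architecture: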

.. graphviz::

   digraph foo {
      edge [ fontname="NSimSun",fontsize=10 ];
      node [ fontname="NSimSun",shape=box ];
      CN1[label="CN(11)\nprimary"];
      CN2[label="CN2(15)\nstandby"];
      DN1[label="DN(12)\nprimary"];
      DN2[label="DN(13)"];
      DN3[label="DN(14)"];
      DN11[label="DN(16)\nstandby"];
      
      CN1 -> CN2 [style=dotted];
      CN1 -> {DN1, DN2, DN3};
      CN2 -> {DN1, DN2, DN3} [style=dotted];
      DN1 -> DN11 [style=dotted];
      {rank=same; DN1, DN2, DN3};
      {rank=same; CN1, CN2};
   }


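DN high-availability failures and recovery
==========================================

Now deliberately stop the database on 12. After a short wait, the status of the high-availability cluster formed by 12 and 16 can be checked on 16: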

.. code:: shell

   [lightdb@7531b42d601c ~]$ ltcluster -f $LTHOME/etc/ltcluster/ltcluster.conf service status
   ID | Name  | Role    | Status    | Upstream | ltclusterd | PID  | Paused? | Upstream last seen
   ----+-------+---------+-----------+----------+------------+------+---------+--------------------
   12 | cn-12 | primary | - failed  | ?        | n/a        | n/a  | n/a     | n/a                
   16 | cn-16 | primary | * running |          | running    | 1161 | no      | n/a                

   WARNING: following issues were detected
   - unable to  connect to node "cn-12" (ID: 12)

   HINT: execute with --verbose option to see connection error messages

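Node 16 has now been promoted to primary.

Although the ltclusterd daemon has automatically promoted the standby, the node metadata of the distributed cluster still points to node 12, so SQL statements still fail: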

.. code:: shell

   lightdb@test1=# select count(*) from test_table;
   ERROR:  connection to the remote node 172.16.0.12:5432 failed with the following error: could not connect to server: Connection refused
      Is the server running on host "172.16.0.12" and accepting
      TCP/IP connections on port 5432?

Manual recovery
----------------

On 11 (the CN), change the metadata of node 12 (in the ``pg_dist_node`` table) so that it points to node 16.

.. code:: shell

   # Query pg_dist_node: the nodeid of node 12 is 2
   lightdb@test1=# select * from pg_dist_node;
   nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards 
   --------+---------+-------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
         2 |       2 | 172.16.0.12 |     5432 | default  | t           | t        | primary  | default     | t              | t
         3 |       3 | 172.16.0.13 |     5432 | default  | t           | t        | primary  | default     | t              | t
         5 |       0 | 172.16.0.11 |     5432 | default  | t           | t        | primary  | default     | t              | f
         6 |       5 | 172.16.0.14 |     5432 | default  | t           | t        | primary  | default     | t              | t
   (4 rows)

   # Change the database address of the node with nodeid=2 to 172.16.0.16:5432
   lightdb@test1=# select canopy_update_node(2,'172.16.0.16',5432);
   canopy_update_node 
   --------------------
   
   (1 row)

   # Querying the distributed table now succeeds.
   lightdb@test1=# select count(*) from test_table;
   count  
   --------
   100000
   (1 row)
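
The lookup of the ``nodeid`` and the update can be combined into one statement. The sketch below only prints that statement. ``gen_update_node_sql`` is a hypothetical name, the standby is assumed to listen on the same port as the failed primary, and calling ``canopy_update_node`` inside a ``SELECT`` over ``pg_dist_node`` is an assumption that has not been verified against LightDB.

.. code:: shell

   #!/bin/bash
   # Hypothetical helper: print the SQL that repoints a failed DN to its
   # promoted standby, looking up the nodeid by the old address.
   # Usage: gen_update_node_sql <old_ip> <new_ip> <port>
   gen_update_node_sql() {
       local old="$1" new="$2" port="$3"
       printf "SELECT canopy_update_node(nodeid, '%s', %s) FROM pg_dist_node WHERE nodename = '%s' AND nodeport = %s;\n" \
           "$new" "$port" "$old" "$port"
   }

For the failure in this section, a possible call on 11 is ``gen_update_node_sql 172.16.0.12 172.16.0.16 5432 | ltsql -d test1`` , assuming ``ltsql`` reads SQL from standard input.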

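Automatic recovery
------------------

As the analysis above shows, after a high-availability failover it is enough to call ``canopy_update_node`` to update the metadata. This can be done by hand or by a high-availability management tool.

LightDB provides ``canopy_ha_monitor.sh`` to monitor failover events of DN nodes. The script is deployed on the CN; if there are several CNs, deploy one copy on each.

Before starting the script, adjust its configuration: open ``${LTHOME}/bin/canopy_ha_monitor.sh`` and modify the following parameters: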

.. code:: shell 

   #!/bin/bash

   # ---------------- config  start -------------------
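   # Name of the distributed database (the canopy extension must be created in this database)
   readonly DATABASE_NAME=postgres

   # Database user name
   readonly DATABASE_USER=lightdb

   # IP address and port of the CN database; with several CNs, configure only the local CN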
   readonly CN_CONNECT_INFO="172.16.0.19:61001"

   declare -A DN_HA_MAP

   # DN:  primary => standby
   # High-availability pairs of all DNs: the key is the primary, the value is the standby
   DN_HA_MAP=(
      ["172.16.0.18:61001"]="172.16.0.18:61002"
      ["172.16.0.17:61001"]="172.16.0.17:61002"
   )

   # ---------------- config  end -------------------


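Once running, the script connects to the CN database, fetches the information of all DN nodes and checks whether each database is working.

If a DN is abnormal, the script checks whether its standby is working and has already switched to primary mode. If so, it calls the ``canopy_update_node`` function on the CN to change
the node's address to that of the standby, after which the distributed database is usable again.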
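
The decision described above can be sketched as follows. This is an illustration of the logic only and is not the code of ``canopy_ha_monitor.sh`` : ``run_sql`` and ``decide`` are hypothetical names, ``ltsql`` is assumed to accept the ``psql`` -style options ``-h`` , ``-p`` , ``-U`` , ``-d`` , ``-A`` , ``-t`` and ``-c`` , and the standard PostgreSQL function ``pg_is_in_recovery()`` is used to tell a promoted standby from one that is still replicating.

.. code:: shell

   #!/bin/bash
   # Illustrative sketch only; this is not the code of canopy_ha_monitor.sh.
   # The defaults below are assumptions matching the walkthrough.
   DATABASE_NAME="${DATABASE_NAME:-test1}"
   DATABASE_USER="${DATABASE_USER:-lightdb}"

   # Run one SQL statement on "host:port" and print the bare result.
   run_sql() {
       local addr="$1" sql="$2"
       ltsql -h "${addr%:*}" -p "${addr#*:}" -U "$DATABASE_USER" \
             -d "$DATABASE_NAME" -At -c "$sql"
   }

   # Decide what to do for one "primary => standby" pair. Prints:
   #   ok      the primary answers, nothing to do
   #   switch  the primary is down and the standby is no longer in recovery,
   #           so canopy_update_node should be called on the CN
   #   wait    the primary is down but the standby has not been promoted
   decide() {
       local primary="$1" standby="$2"
       if run_sql "$primary" "select 1" >/dev/null 2>&1; then
           echo ok
       elif [ "$(run_sql "$standby" "select pg_is_in_recovery()" 2>/dev/null)" = "f" ]; then
           echo switch
       else
           echo wait
       fi
   }

For the pair used earlier, ``decide 172.16.0.12:5432 172.16.0.16:5432`` would report which of the three cases applies.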

..
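   When ltclusterd performs a failover, it calls the user-defined script ``failover_validation_command`` .
   In this test environment the setting can be added to the configuration file ``${LTHOME}/etc/ltcluster/ltcluster.conf`` (see the `ltcluster manual <http://www.light-pg.com/docs/ltcluster/current/ltclusterd-basic-configuration.html>`__ ).
   The script can connect to the CN and call ``canopy_update_node`` , which gives automatic recovery after a primary/standby switchover.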

.. _install_patroni:

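Patroni high-availability deployment
====================================

The high-availability cluster of each node is independent. On top of the deployment above, node 17 is therefore added as the standby of node 13, and 13 and 17 form a Patroni high-availability cluster to demonstrate the deployment.
The final architecture is as follows: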

.. graphviz::

   digraph foo {
      edge [ fontname="NSimSun",fontsize=10 ];
      node [ fontname="NSimSun",shape=box ];
      CN1[label="CN(11)\nprimary"];
      CN2[label="CN2(15)\nstandby"];
      DN1[label="DN(12)\nprimary"];
      DN2[label="DN(13)\nprimary"];
      DN3[label="DN(14)"];
      DN11[label="DN(16)\nstandby"];
      DN21[label="DN(17)\nstandby"];
      
      CN1 -> CN2 [style=dotted];
      CN1 -> {DN1, DN2, DN3};
      CN2 -> {DN1, DN2, DN3} [style=dotted];
      DN1 -> DN11 [style=dotted];
      DN2 -> DN21 [style=dotted];
      {rank=same; DN1, DN2, DN3};
      {rank=same; CN1, CN2};
      {rank=same; DN11, DN21};
   }

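Deploying etcd
----------------

Patroni depends on etcd. Here a single-node etcd is deployed on machine 18; a production environment should deploy an etcd cluster.

After obtaining the etcd release package, extract it and it is ready to use. The version tested here is ``etcd-v3.4.23-linux-amd64`` .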

.. code:: shell

   etcd --name 'etcd18' \
      --data-dir '/data/etcd' \
      --listen-client-urls 'http://0.0.0.0:2379' \
      --advertise-client-urls 'http://172.16.0.18:2379' \
      --listen-peer-urls 'http://0.0.0.0:2380' \
      --initial-advertise-peer-urls 'http://0.0.0.0:2380'  \
      --enable-v2=true

.. _install_patroni_primary:

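Deploying the Patroni primary node
----------------------------------

Obtain the LightDB Patroni package ( ``patroni-2.1.3-lightdb`` ) and extract it on machine 13.

Patroni is written in Python, so a Python 3 environment is required first. Then install the following dependencies: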

.. code:: shell
   
   # First change into the patroni-2.1.3-lightdb directory
   pip3 install --user -U pip setuptools
   pip3 install --user -r requirements.txt
   pip3 install --user psycopg

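Change into the ``patroni-2.1.3-lightdb`` directory and edit the configuration file ``lightdb0.yml`` :

.. code:: yaml

   # Cluster name
   scope: cluster13

   # Node name
   name: lightdb13

   # Add hosts to the etcd section
   hosts:
   - 172.16.0.18:2379

   # postgresql section
   # Change listen to the listening IP and port of the instance on machine 13
   listen: 127.0.0.1,172.16.0.13:5432
   connect_addr: 172.16.0.13:5432
   # Change data_dir to match the installation above
   data_dir: /data/lightdb/lightdb-x/13.8-22.4/data/defaultCluster
   # Change the superuser name and password to match the installation above, for example: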
   superuser:
      username: lightdb
      password: lightdb123


Log in to the database and create the users Patroni needs. If you use different user names or passwords, update ``lightdb0.yml`` accordingly.

.. code:: SQL

   CREATE USER replicator WITH replication encrypted password 'rep-pass';
   CREATE USER rewind_user WITH encrypted password 'rewind_password';


Start Patroni:

.. code:: shell

   ./patroni.py lightdb0.yml

The cluster status can now be checked with the ``patronictl.py`` tool; it shows a single node whose Role is Leader:

.. code:: shell

   [lightdb@1e631f45d1f0 patroni-2.1.3-lightdb]$ ./patronictl.py  -c ./lightdb0.yml  list
   +-----------+-------------+--------+---------+----+-----------+-----------------+
   | Member    | Host        | Role   | State   | TL | Lag in MB | Pending restart |
   + Cluster: cluster13 (7188356765573697849) --+----+-----------+-----------------+
   | lightdb13 | 172.16.0.13 | Leader | running |  2 |           | *               |
   +-----------+-------------+--------+---------+----+-----------+-----------------+

Deploying the Patroni standby node
----------------------------------

On standby node 17, proceed as in :ref:`install_patroni_primary` : change into the ``patroni-2.1.3-lightdb`` directory and edit the configuration file ``lightdb0.yml`` :

.. code:: yaml

   # Cluster name
   scope: cluster13

   # Node name
   name: lightdb17

   # Add hosts to the etcd section
   hosts:
   - 172.16.0.18:2379

   # postgresql section
   # Change listen to the listening IP and port of the instance on machine 17
   listen: 127.0.0.1,172.16.0.17:5432
   connect_addr: 172.16.0.17:5432
   # Change data_dir to match the installation above
   data_dir: /data/lightdb/lightdb-x/13.8-22.4/data/defaultCluster
   # Change the superuser name and password to match the installation above, for example:
   superuser:
      username: lightdb
      password: lightdb123

Start Patroni:

.. code:: shell

   ./patroni.py lightdb0.yml

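Because the ``cluster13`` cluster already has a primary, node 17 automatically runs in standby mode.

Checking the cluster status with the ``patronictl.py`` tool now shows that the Patroni cluster consists of two nodes: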

.. code:: shell

   [lightdb@1e631f45d1f0 patroni-2.1.3-lightdb]$ ./patronictl.py  -c ./lightdb0.yml  list
   +-----------+-------------+---------+---------+----+-----------+-----------------+
   | Member    | Host        | Role    | State   | TL | Lag in MB | Pending restart |
   + Cluster: cluster13 (7188356765573697849) ---+----+-----------+-----------------+
   | lightdb13 | 172.16.0.13 | Leader  | running |  4 |           | *               |
   | lightdb17 | 172.16.0.17 | Replica | running |  4 |         0 | *               |
   +-----------+-------------+---------+---------+----+-----------+-----------------+