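Case description:
In a cluster environment, changing cluster or database parameters means making the change on every node. Editing each node by hand makes it easy to miss a node or to end up with inconsistent parameters across nodes. KingbaseES V8R6 clusters add a one-click parameter modification feature: you configure the change on one node and it is applied to all nodes in the cluster, which avoids these errors.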
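Official documentation: https://help.kingbase.com.cn/stage-api/profile/document/kes/v8r6/html/highly/highly-tools/cluster-use/cluster-use-6.html#id2
6. One-click modification of cluster parameters
6.1. Command-line commands
6.1.1. sys_monitor.sh set/set --restart
sys_monitor.sh set/set --restart: one-click modification of the cluster configuration and the database configuration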
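6.1.1.1. Description
The sys_monitor.sh set script reads the settings in the configuration files set_repmgr.conf and set_db.conf and applies them to the cluster configuration files repmgr.conf and es_rep.conf respectively.
6.1.1.2. Usage
Create set_repmgr.conf in the $db_etc directory; the parameters set in this file are applied to the cluster's repmgr.conf.
Example:
Figure 6.1.20 Modifying the cluster's repmgr.conf file
Create set_db.conf in the $db_etc directory; the parameters set in this file are applied to the cluster's es_rep.conf.
Example:
Figure 6.1.21 Modifying the cluster's es_rep.conf file
6.1.1.3. Options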
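The script accepts the following arguments:
set
Calls the function that modifies the cluster and database configuration.
set --restart
Calls the function that modifies the cluster and database configuration, then restarts the cluster. Use this option when a modified parameter only takes effect after a database restart.
Example:
Figure 6.1.22 One-click modification of cluster and database configuration
Example:
Figure 6.1.23 One-click modification of cluster and database configuration
6.1.1.4. Notes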
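1. Parameters can only be modified while all cluster nodes are in a normal state.
2. In set_db.conf:
Do not configure the port parameter.
Do not configure parameters that do not exist in kingbase.conf.
Do not assign invalid values to parameters.
3. In set_repmgr.conf:
Modifying path-related parameters, database connection strings, and similar settings is not recommended. Parameters suited to one-click modification include, for example: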
reconnect_attempts
reconnect_interval
recovery
trusted_servers
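The first two set_db.conf rules above can be checked mechanically before running the script. The following is a minimal sketch, not part of KingbaseES: `check_set_db` is a hypothetical helper name, and it assumes that every valid parameter appears in kingbase.conf either active or as a commented-out default. It cannot tell whether a value is valid, so the third rule still needs a manual review.

```shell
# Hypothetical pre-check for set_db.conf (not shipped with KingbaseES).
# Usage: check_set_db <set_db.conf> <kingbase.conf>
check_set_db() {
    set_file="$1"
    ref_file="$2"
    rc=0
    # Collect parameter names: skip blank and comment lines, keep the text before "=".
    keys=$(awk -F= '/^[ \t]*(#|$)/ { next } { gsub(/[ \t]/, "", $1); print $1 }' "$set_file")
    for key in $keys; do
        if [ "$key" = "port" ]; then
            echo "ERROR: port must not be set in $set_file" >&2
            rc=1
        elif ! grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*=" "$ref_file"; then
            # Assumption: a parameter unknown to kingbase.conf is not listed there at all.
            echo "ERROR: $key does not appear in $ref_file" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call on the primary node would be `check_set_db ../etc/set_db.conf ../data/kingbase.conf`, run from the bin directory used in the test case below.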
Test case:
Database version under test:
test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C005B0041 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
Cluster node information:
[kingbase@node102 bin]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.101 node101 ### Standby
192.168.1.102 node102 ### Primary
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node101 | standby | running | node102 | running | 15547 | no | 1 second(s) ago
2 | node102 | primary | * running | | running | 16388 | no | n/a
I. Check the cluster node status
=Note: before modifying parameters, make sure every node in the cluster is in a normal state.=
[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | standby | running | node102 | default | 100 | 2 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | primary | * running | | default | 100 | 2 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
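The status check above can also be scripted so that a parameter change is only attempted when every node reports `running`. This is a sketch under the assumption that the output keeps the column layout shown above (Status in the fourth `|`-separated column); `all_nodes_running` is a hypothetical helper name.

```shell
# Reads `repmgr cluster show` output on stdin.
# Succeeds only if there is at least one node row and every row has status "running".
all_nodes_running() {
    awk -F'|' '
        # Data rows have a numeric node ID in the first column; header and separator rows do not.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            rows++
            status = $4
            gsub(/[ \t*]/, "", status)   # strip padding and the "*" that marks the primary
            if (status != "running") bad++
        }
        END { exit (rows > 0 && bad == 0) ? 0 : 1 }
    '
}

# Example (run from the bin directory):
#   ./repmgr cluster show | all_nodes_running && ./sys_monitor.sh set
```

Any other status text, for example a failed or unreachable node, makes the function return non-zero, so the chained command is skipped.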
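II. Parameter modification test
=In the etc directory on the primary node, create set_repmgr.conf and set_db.conf, and write the repmgr and database parameters to be changed, together with their values, into these files.=
1. Check the repmgr parameter to be modified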
[kingbase@node101 data]$ cat ../etc/repmgr.conf |grep -i recovery
#recovery='standby'
recovery='automatic'
[kingbase@node102 etc]$ cat repmgr.conf |grep recovery
#recovery='standby'
recovery='automatic'
2. Configure the repmgr parameter file (write the parameters to be modified into this file; mind the file name)
[kingbase@node102 etc]$ cat set_repmgr.conf
recovery=standby
3. Check the database parameter to be modified
[kingbase@node101 data]$ cat kingbase.conf |grep work_mem
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
[kingbase@node102 bin]$ cat ../data/kingbase.conf |grep -i work_mem
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
4. Configure the database parameter file (write the parameters to be modified into this file; mind the file name)
[kingbase@node102 etc]$ cat set_db.conf
work_mem = 16MB
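Steps 2 and 4 only show the finished files. One way to create them is sketched here, with the values used in this test case; `write_set_files` is a hypothetical helper name, and the directory argument is whatever etc directory your cluster uses.

```shell
# Hypothetical helper: writes the two input files that `sys_monitor.sh set` reads.
# $1 = the cluster's etc directory on the primary node (../etc relative to bin in this test case).
write_set_files() {
    etc_dir="$1"
    # repmgr parameters to change (value taken from this test case)
    cat > "$etc_dir/set_repmgr.conf" <<'EOF'
recovery=standby
EOF
    # database parameters to change (value taken from this test case)
    cat > "$etc_dir/set_db.conf" <<'EOF'
work_mem = 16MB
EOF
}
```

Calling `write_set_files ../etc` from the bin directory overwrites any existing set_repmgr.conf and set_db.conf, so review the contents before running the one-click modification.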
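5. Modify the parameters in one step
=Note: if the modified parameter takes effect without a restart, use set; if it only takes effect after the cluster or database is restarted, use set --restart.=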
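Choosing between the two forms depends on whether the database parameter can be applied by a reload. The sketch below rests on an assumption this article does not confirm: that KingbaseES, being PostgreSQL-derived, exposes a PostgreSQL-style `context` value for each setting (in PostgreSQL this is a column of the `pg_settings` view, where `postmaster` means the value is only read at server start). `set_args_for_context` is a hypothetical helper; look up the context value yourself and verify the mapping against the KingbaseES documentation.

```shell
# Maps a PostgreSQL-style setting context to the sys_monitor.sh arguments to use.
# Assumption: KingbaseES reports the same context values as PostgreSQL.
set_args_for_context() {
    case "$1" in
        postmaster)
            echo "set --restart"   # only read at server start
            ;;
        internal)
            echo "read-only parameter, cannot be changed" >&2
            return 1
            ;;
        *)
            echo "set"             # sighup, user, superuser, backend, ...: a reload is enough
            ;;
    esac
}
```

For example, `./sys_monitor.sh $(set_args_for_context user)` expands to `./sys_monitor.sh set`.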
[kingbase@node102 bin]$ ./sys_monitor.sh set
2022-04-06 11:58:29 Begin to set the es_rep.conf for database ...
2022-04-06 11:58:30 End to set the es_rep.conf for database ... OK
2022-04-06 11:58:30 Begin to set the repmgr.conf for repmgrd ...
2022-04-06 11:58:31 End to set the repmgr.conf for repmgrd ... OK
2022-04-06 11:58:31 Sending signal to reload the database on "192.168.1.102"
server signaled
2022-04-06 11:58:31 Sending signal to reload the database on "192.168.1.101"
server signaled
2022-04-06 11:58:32 Sending signal(SIGKILL) to stop the kbha on "192.168.1.102"
2022-04-06 11:58:34 Sending signal(SIGKILL) to stop the repmgrd on "192.168.1.102"
2022-04-06 11:58:34 begin to stop repmgrd on "[192.168.1.102]".
2022-04-06 11:58:35 repmgrd on "[192.168.1.102]" stop success.
2022-04-06 11:58:36 Start the repmgrd on "192.168.1.102"
2022-04-06 11:58:36 begin to start repmgrd on "[192.168.1.102]".
[2022-04-06 11:58:36] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 11:58:36] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"
2022-04-06 11:58:38 repmgrd on "[192.168.1.102]" start success.
2022-04-06 11:58:38 Start the kbha on "192.168.1.102"
[2022-04-06 11:58:41] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
2022-04-06 11:58:42 Sending signal(SIGKILL) to stop the kbha on "192.168.1.101"
2022-04-06 11:58:45 Sending signal(SIGKILL) to stop the repmgrd on "192.168.1.101"
2022-04-06 11:58:45 begin to stop repmgrd on "[192.168.1.101]".
2022-04-06 11:58:46 repmgrd on "[192.168.1.101]" stop success.
2022-04-06 11:58:46 Start the repmgrd on "192.168.1.101"
2022-04-06 11:58:46 begin to start repmgrd on "[192.168.1.101]".
[2022-04-06 11:58:47] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 11:58:47] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"
2022-04-06 11:58:48 repmgrd on "[192.168.1.101]" start success.
2022-04-06 11:58:48 Start the kbha on "192.168.1.101"
[2022-04-06 11:58:51] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
Some changes may not take effect by RELOAD, you can execute "./sys_monitor.sh restart" to restart the cluster
III. Verify the parameter changes
1. Check the modified repmgr parameter
As shown below, the modified repmgr parameter is appended to the end of repmgr.conf:
[kingbase@node102 bin]$ cat ../etc/repmgr.conf |grep -i recovery
#recovery='standby'
recovery='automatic'
auto_cluster_recovery_level=1
recovery=standby ### modified parameter
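Because the new value is appended rather than replacing the old line, a plain grep shows several candidates. The sketch below picks the one that counts, assuming the last uncommented occurrence takes effect, as it does in PostgreSQL-style configuration files. `effective_value` is a hypothetical helper, and it treats everything after a `#` as a comment, so it is not suitable for values that contain `#` inside quotes.

```shell
# Prints the effective value of a parameter in a key=value configuration file.
# $1 = configuration file, $2 = parameter name. Returns non-zero if the parameter is not set.
effective_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                      # ignore commented-out lines
        {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name == key) {
                val = substr($0, index($0, "=") + 1)
                sub(/[ \t]*#.*$/, "", val)       # drop a trailing comment
                gsub(/^[ \t]+|[ \t]+$/, "", val) # trim surrounding whitespace
                found = 1                        # later matches overwrite earlier ones
            }
        }
        END { if (found) print val; else exit 1 }
    ' "$1"
}
```

Running `effective_value ../etc/repmgr.conf recovery` on each node and comparing the results is a quick way to confirm that the nodes agree.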
2. Check the modified database parameter
As shown below, the modified database parameter is appended to the end of es_rep.conf:
[kingbase@node102 bin]$ cat ../data/es_rep.conf |grep -i work_mem
work_mem = 16MB
# Check in the database:
[kingbase@node102 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# show work_mem;
work_mem
----------
16MB
(1 row)
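IV. Verify parameter changes that restart the cluster
=Note: some cluster or database parameters only take effect after the cluster or database service is restarted. Make such changes during off-peak hours so that business access in production is not disrupted.=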
[kingbase@node102 bin]$ ./sys_monitor.sh set --restart
2022-04-06 12:00:23 Begin to set the es_rep.conf for database ...
2022-04-06 12:00:24 End to set the es_rep.conf for database ... OK
2022-04-06 12:00:24 Begin to set the repmgr.conf for repmgrd ...
2022-04-06 12:00:25 End to set the repmgr.conf for repmgrd ... OK
2022-04-06 12:00:25 Ready to stop all DB ...
Service process "node_export" was killed at process 8009
Service process "postgres_ex" was killed at process 8010
Service process "node_export" was killed at process 8135
Service process "postgres_ex" was killed at process 8136
2022-04-06 12:00:32 begin to stop repmgrd on "[192.168.1.102]".
2022-04-06 12:00:33 repmgrd on "[192.168.1.102]" stop success.
2022-04-06 12:00:33 begin to stop repmgrd on "[192.168.1.101]".
2022-04-06 12:00:34 repmgrd on "[192.168.1.101]" stop success.
2022-04-06 12:00:34 begin to stop DB on "[192.168.1.101]".
waiting for server to shut down.... done
server stopped
2022-04-06 12:00:34 DB on "[192.168.1.101]" stop success.
2022-04-06 12:00:34 begin to stop DB on "[192.168.1.102]".
waiting for server to shut down.... done
server stopped
2022-04-06 12:00:35 DB on "[192.168.1.102]" stop success.
2022-04-06 12:00:35 Done.
2022-04-06 12:00:35 Ready to start all DB ...
2022-04-06 12:00:35 begin to start DB on "[192.168.1.102]".
waiting for server to start.... done
server started
2022-04-06 12:00:36 execute to start DB on "[192.168.1.102]" success, connect to check it.
2022-04-06 12:00:37 DB on "[192.168.1.102]" start success.
2022-04-06 12:00:37 Try to ping trusted_servers on host 192.168.1.102 ...
2022-04-06 12:00:40 Try to ping trusted_servers on host 192.168.1.101 ...
2022-04-06 12:00:42 begin to start DB on "[192.168.1.101]".
waiting for server to start.... done
server started
2022-04-06 12:00:43 execute to start DB on "[192.168.1.101]" success, connect to check it.
2022-04-06 12:00:44 DB on "[192.168.1.101]" start success.
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
1 | node101 | standby | running | node102 | default | 100 | 2 | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node102 | primary | * running | | default | 100 | 2 | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2022-04-06 12:00:44 The primary DB is started.
2022-04-06 12:00:44 begin to start repmgrd on "[192.168.1.102]".
[2022-04-06 12:00:45] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 12:00:45] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"
2022-04-06 12:00:47 repmgrd on "[192.168.1.102]" start success.
2022-04-06 12:00:47 begin to start repmgrd on "[192.168.1.101]".
[2022-04-06 12:00:49] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 12:00:49] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"
2022-04-06 12:00:50 repmgrd on "[192.168.1.101]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node101 | standby | running | node102 | running | 15547 | no | 1 second(s) ago
2 | node102 | primary | * running | | running | 16388 | no | n/a
[2022-04-06 12:00:53] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
[2022-04-06 12:00:57] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"
2022-04-06 12:00:58 Done.
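V. Parameter modification failure case
=When modifying parameters, if only set_repmgr.conf has been configured and set_db.conf is missing, the one-click modification reports the errors below; both files need to be created. In the run shown here the script found neither file, so also check the file names and the etc directory.=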
[kingbase@node102 bin]$ ./sys_monitor.sh set
2022-04-06 11:51:18 WRNING: there is no file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_db.conf", don't change the es_rep.conf for database
2022-04-06 11:51:18 WRNING: there is no file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_repmgr.conf", don't change the repmgr.conf for repmgrd
2022-04-06 11:51:18 ERROR: the file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_db.conf" and "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_repmgr.conf" do not exist, exit
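A small pre-flight check can catch this situation before the script is run. This is a sketch with a hypothetical helper name, `require_set_files`; it only tests that both files exist under the given etc directory, which also catches misspelled file names.

```shell
# Succeeds only if both input files for `sys_monitor.sh set` exist in the given etc directory.
require_set_files() {
    etc_dir="$1"
    missing=0
    for f in set_repmgr.conf set_db.conf; do
        if [ ! -f "$etc_dir/$f" ]; then
            echo "missing: $etc_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run from the bin directory):
#   require_set_files ../etc && ./sys_monitor.sh set
```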
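VI. Summary
The one-click parameter modification feature in KingbaseES V8R6 clusters simplifies cluster maintenance and management. Note, however, that not every parameter can be modified this way; see the official documentation for details.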