2. GaussDB Distributed Database Deployment
1. Preparing the Installation Packages
Kylin-Server-10-SP2-x86-Release-Build09-20210524.iso: the Kylin V10 operating system image.
GaussDBInstaller_1.0.5.6_20230630015648.tar.gz: the database installation tool package.
GaussDB_X86_Kylinv10_Distributed_2.23.01.230_20230705094619.tar.gz: the distributed database package built for Kylin V10.
Note: Huawei GaussDB installation packages are not available for individual download; only paying customers and partners may download and use them. Centralized and distributed deployments use different packages; here we use the distributed one.
2. Creating the Database Administration User
groupadd dbgrp                        # auxiliary group expected by the install tooling
useradd omm                           # OS user that will own and run the database
echo Gauss_2xx|passwd omm --stdin     # set the password non-interactively
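Run these commands as root on all three nodes. A quick check that the account and group exist:
# id and getent are standard Linux tools; both should report without error
id omm
getent group dbgrp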
3. Checking Disk Capacity
[root@gaussdb01 GaussDB]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs 7.2G 0 7.2G 0% /dev
tmpfs 7.2G 12K 7.2G 1% /dev/shm
tmpfs 7.2G 9.6M 7.2G 1% /run
tmpfs 7.2G 0 7.2G 0% /sys/fs/cgroup
/dev/mapper/klas-root 52G 11G 42G 21% /
tmpfs 7.2G 4.0K 7.2G 1% /tmp
/dev/sda1 1014M 214M 801M 22% /boot
tmpfs 1.5G 0 1.5G 0% /run/user/993
tmpfs 1.5G 0 1.5G 0% /run/user/0
/dev/mapper/datavg-datalv 95G 61M 90G 1% /data
Anything over 50 GB of free disk space is enough for testing.
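For reference, the installer performs a similar free-space check of its own during preProcess (visible in the install log later in this section). A quick manual check of the filesystem backing /data:
# prints the Avail column for the /data mount
df -h /data | awk 'NR==2 {print $4, "available"}'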
4. Our IP Addresses and Network Interfaces
192.168.1.52 192.168.1.53 192.168.1.54    database management IPs, NIC ens33
192.168.2.52 192.168.2.53 192.168.2.54    data IPs, NIC ens39
192.168.3.52 192.168.3.53 192.168.3.54    virtual IPs, NIC ens34
gaussdb01, gaussdb02, gaussdb03           hostnames
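The install tooling drives all three hosts over SSH, so each hostname should resolve on every node. A sketch of the /etc/hosts entries, assuming (as is conventional) that hostnames map to the management IPs:
# append on every node; hypothetical but typical name-resolution setup
cat >> /etc/hosts <<'EOF'
192.168.1.52 gaussdb01
192.168.1.53 gaussdb02
192.168.1.54 gaussdb03
EOF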
5. Extracting the Database Installation Tool
[root@gaussdb01 GaussDB]# tar zxvf GaussDBInstaller_1.0.5.6_20230630015648.tar.gz -C /data
GaussDBInstaller/
GaussDBInstaller/install_cluster.conf
GaussDBInstaller/ReadMe.txt
GaussDBInstaller/gaussdb_install.py
GaussDBInstaller/pkgDir/
GaussDBInstaller/pkgDir/ReadMe.txt
GaussDBInstaller/jsonFileSample/
GaussDBInstaller/jsonFileSample/3_nodes_centralized_paxos.json
GaussDBInstaller/jsonFileSample/3_nodes_distributed.json
GaussDBInstaller/jsonFileSample/4_nodes_distributed_4shards.json
GaussDBInstaller/jsonFileSample/1_node.json
GaussDBInstaller/jsonFileSample/5_nodes_distributed.json
GaussDBInstaller/jsonFileSample/2_nodes_centralized_1primary_1standby_1logger.json
GaussDBInstaller/jsonFileSample/4_nodes_distributed_8shards.json
GaussDBInstaller/jsonFileSample/5_nodes_centralized.json
GaussDBInstaller/jsonFileSample/9_nodes_distributed_8shards.json
GaussDBInstaller/jsonFileSample/9_nodes_distributed_4shards.json
GaussDBInstaller/jsonFileSample/3_nodes_centralized.json
GaussDBInstaller/jsonFileSample/3_nodes_centralized_1primary_1standby_1logger.json
GaussDBInstaller/install_cluster.sh
Note that the tool package is extracted into the /data directory. Of the extracted contents, pkgDir is where the database packages will be placed in step 6, and jsonFileSample holds the deployment topology templates used in step 7.
Take a look at the extracted files:
[root@gaussdb01 data]# cd GaussDBInstaller/
[root@gaussdb01 GaussDBInstaller]# ll
total 84
-rw------- 1 root root 37396 Jun 30 01:56 gaussdb_install.py
-rw------- 1 root root   603 Jun 30 01:56 install_cluster.conf
-rw------- 1 root root 28363 Jun 30 01:56 install_cluster.sh
drwx------ 2 root root  4096 Jun 30 01:56 jsonFileSample
drwx------ 2 root root  4096 Jun 30 01:56 pkgDir
-rw------- 1 root root  3344 Jun 30 01:56 ReadMe.txt
6. Extracting the Database Installation Package
Extract the GaussDB database package into the /data/GaussDBInstaller/pkgDir/ directory:
[root@gaussdb01 GaussDB]# tar zxvf GaussDB_X86_Kylinv10_Distributed_2.23.01.230_20230705094619.tar.gz -C /data/GaussDBInstaller/pkgDir/
DBS-GaussDB-Adaptor_2.23.01.230.1688550379.tar.gz
DBS-GaussDB-Adaptor_2.23.01.230.1688550379.tar.gz.md5
GaussDB-Kernel_503.1.0.SPC1200.B018_Om_X86_Distributed.tar.gz
GaussDB-Kernel_503.1.0.SPC1200.B018_Om_X86_Distributed.tar.gz.md5
GaussDB-Kernel_503.1.0.SPC1200.B018_Server_X86_Distributed.tar.gz
GaussDB-Kernel_503.1.0.SPC1200.B018_Server_X86_Distributed.tar.gz.md5
[root@gaussdb01 GaussDB]#
As the listing shows, there are three components: Adaptor, Om, and Server.
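Each tarball comes with an .md5 checksum file, so integrity can be verified before installation (the install log later also shows the tool's own "Integrity check passed" step). A minimal sketch, assuming the .md5 files use the standard "hash  filename" format that md5sum -c accepts:
cd /data/GaussDBInstaller/pkgDir
# verify each tarball against its bundled checksum file
for f in *.md5; do md5sum -c "$f"; done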
7. The Distributed JSON Configuration File Used for Deployment
An important note: avoid editing the configuration file by hand, as it is very error-prone. For distributed databases such as GaussDB, OceanBase, and TiDB, the deployment configuration files have a strict structure, and a hand-edited file typically fails again and again until every entry is exactly right.
Let's first see what happens when we deploy with a hand-edited configuration file.
(1) The hand-edited install_cluster.json looks like this:
vi install_cluster.json
[root@gaussdb01 GaussDBInstaller]# cat install_cluster.json
{
"rdsAdminUser":"rdsAdmin",
"rdsAdminPasswd":"Gauss_123",
"rdsMetricUser":"rdsMetric",
"rdsMetricPasswd":"Gauss_123",
"rdsReplUser":"rdsRepl",
"rdsReplPasswd":"Gauss_123",
"rdsBackupUser":"rdsBackup",
"rdsBackupPasswd":"Gauss_123",
"dbPort":"8000",
"dbUser":"root",
"dbUserPasswd":"Gauss_123",
"clusterMode":"combined",
"params":{
"enable_thread_pool":"on",
"enable_bbox_dump":"on",
"bbox_dump_path":"/home/core"
},
"cnParams":{
},
"dnParams":{
},
"cmParams":{
},
"clusterConf":{
"clusterName":"Gauss_3Cluster",
"encoding": "utf8",
"shardingNum": 3,
"replicaNum": 3,
"solution":"hws",
"cm":[
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"cn":[
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"gtm":[
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"shards":[
[
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
[
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
[
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
},
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ1",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
}
]
],
"etcd":{
"nodes":[
{
"rack": "gauss01",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss02",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss03",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
]
}
}
}
[root@gaussdb01 GaussDBInstaller]# more install_cluster.conf
[COMMON]
os_user = omm
os_user_group = ${os_user}
os_user_home = /home/${os_user}
os_user_passwd = Gauss_234
root_passwd = Root-Xsq
ssh_port = 22
node_ip_list = 192.168.1.52,192.168.1.53,192.168.1.54
[OMAGENT]
gauss_home = /data/cluster
om_agent_port = 30170
mgr_net =
data_net =
virtual_net =
log_dir = ${gauss_home}/logs/gaussdb
cn_dir = ${gauss_home}/data/cn
gtm_dir = ${gauss_home}/data/gtm
cm_dir = ${gauss_home}/data/cm
tmp_dir = ${gauss_home}/temp
data_dir = ${gauss_home}/data/dn
tool_dir = ${gauss_home}/tools
etcd_dir = ${gauss_home}/data/etcd
The configuration looks fine at first glance. Let's run the deployment and see whether the hand-edited file actually works.
python3 gaussdb_install.py --action main
[root@gaussdb01 GaussDBInstaller]# python3 gaussdb_install.py --action main
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:12][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
192.168.1.53 192.168.1.54
[2023-08-28 18:11:12][root][INFO]:Start to execute preProcess...
[2023-08-28 18:11:12][root][INFO]:SERVER_PACKAGE exists
[2023-08-28 18:11:12][root][INFO]:ADAPTOR_PACKAGE exists
[2023-08-28 18:11:12][root][INFO]:AGENT_PACKAGE exists
[2023-08-28 18:11:12][root][INFO]:Integrity check passed
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: mkdir -p /data/cluster
[2023-08-28 18:11:12][root][INFO]:End to execute cmd mkdir -p /data/cluster
[2023-08-28 18:11:12][root][INFO]:Start check json file
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:11:12][root][INFO]:End to execute cmd cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:11:12][root][INFO]:preProcess start check disk capacity
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: df -h /data/cluster |grep "/" |awk '{print $4}' | awk '$1=$1'
[2023-08-28 18:11:12][root][INFO]:End to execute cmd df -h /data/cluster |grep "/" |awk '{print $4}' | awk '$1=$1'
[2023-08-28 18:11:12][root][INFO]:Start check json file
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:11:12][root][INFO]:End to execute cmd cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:11:12][root][INFO]:mgr_net_ip = 192.168.1.52, data_net_ip= 192.168.2.52, virtual_net_ip= 192.168.3.52
[2023-08-28 18:11:12][root][INFO]:preProcess start check SSH connection
[2023-08-28 18:11:12][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh ssh_connection root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:13][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh ssh_connection root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:13][root][INFO]:Start check openssl expect ifconfig
[2023-08-28 18:11:13][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh check_cmd root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:15][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh check_cmd root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:15][root][INFO]:PreProcess task is processing, Please see /data/GaussDBInstaller/install_cluster.log for details
[2023-08-28 18:11:15][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh preProcess 192.168.1.53 192.168.1.54
[2023-08-28 18:11:32][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh preProcess 192.168.1.53 192.168.1.54
[2023-08-28 18:11:32][root][INFO]:preProcess dynamic library check start in host:192.168.1.52
[2023-08-28 18:11:32][root][WARNING]:preProcess dynamic library check is complete in host:192.168.1.52 [Not Found]:['libavcodec.so.58', 'libavformat.so.58', 'libawt.so', 'libawt_xawt.so', 'libffi.so.6', 'libgstreamer-lite.so', 'libiperf.so.0', 'libjava.so', 'libjli.so', 'libjvm.so', 'libnet.so', 'libnio.so', 'libnsl.so.1', 'libverify.so']
[2023-08-28 18:11:32][root][INFO]:Start check the tmp_dir directory is empty
[2023-08-28 18:11:32][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh check_tmp_dir root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:34][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh check_tmp_dir root 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:34][root][INFO]:check the tmp_dir directory sucess
[2023-08-28 18:11:34][root][INFO]:End to execute preProcess Sucess...
preProcess successfully.
[2023-08-28 18:11:34][root][INFO]:Start to execute setEnv, ipList is 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:34][root][INFO]:Start to execute setOSEnv, ipList is 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:34][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh executeRemote setOSEnv 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:35][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh executeRemote setOSEnv 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:35][root][INFO]:SetOSEnv task is processing, Please see /data/GaussDBInstaller/install_cluster.log for details
[2023-08-28 18:11:35][root][INFO]:End to execute setOSEnv...
[2023-08-28 18:11:35][root][INFO]:Start to execute setUserEnv, ipList is 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:35][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh executeRemote setUserEnv 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:37][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh executeRemote setUserEnv 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:37][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh check_user_ssh_connection omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:39][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh check_user_ssh_connection omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:39][root][INFO]:SetUserEnv task is processing, Please see /data/GaussDBInstaller/install_cluster.log for details
[2023-08-28 18:11:39][root][INFO]:End to execute setUserEnv...
[2023-08-28 18:11:39][root][INFO]:End to execute setEnv...
setEnv successfully.
[2023-08-28 18:11:39][root][INFO]:Start to execute genCertificate...
[2023-08-28 18:11:39][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh genCertificate 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:41][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh genCertificate 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:41][root][INFO]:End to execute genCertificate Sucess...
[2023-08-28 18:11:41][root][INFO]:Start check genCertificate status...
[2023-08-28 18:11:41][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh check_genCertificate omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:42][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh check_genCertificate omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:42][root][INFO]:check genCertificate status sucess
The certificate is successfully pushed
[2023-08-28 18:11:42][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:42][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:11:42][root][INFO]:Start to execute installOmAgent...
[2023-08-28 18:11:42][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh executeRemoteUser installOmAgent 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:06][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh executeRemoteUser installOmAgent 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:06][root][INFO]:End to execute installOmAgent...
[2023-08-28 18:12:06][root][INFO]:Start check OmAgent status
[2023-08-28 18:12:06][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh check_omAgent omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:07][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh check_omAgent omm 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:07][root][INFO]:OmAgent status is normal
192.168.1.53 192.168.1.54
The OmAgent is installed successfully.
[2023-08-28 18:12:07][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:07][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:12:07][root][INFO]:Start to execute preInstallCluster in local host 192.168.1.52
[2023-08-28 18:12:07][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh executeRemoteUser preInstallClusterLocal 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:07][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh executeRemoteUser preInstallClusterLocal 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:07][root][INFO]:End to execute preInstallCluster...
192.168.1.53 192.168.1.54
{"detailmsg": "SUCCESS", "retcode": 0}
{"detailmsg": "SUCCESS", "retcode": 0}
{"detailmsg": "SUCCESS", "retcode": 0}
preInstallCluster installation is successful.
[2023-08-28 18:14:07][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:07][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:07][root][INFO]:Start to execute installCluster...
[2023-08-28 18:14:08][root][INFO]:Start to execute cmd: sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:08][root][INFO]:End to execute cmd sh /data/GaussDBInstaller/install_cluster.sh getCurrentIp 192.168.1.52 192.168.1.53 192.168.1.54
[2023-08-28 18:14:08][root][INFO]:Start to execute installCluster...
[2023-08-28 18:14:08][root][INFO]:Start check json file
[2023-08-28 18:14:08][root][INFO]:Start to execute cmd: cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:14:08][root][INFO]:End to execute cmd cat /data/GaussDBInstaller/install_cluster.json
[2023-08-28 18:14:08][root][INFO]:The install cluster cmd is echo '{"dbPort": "8000", "nodeIp": "192.168.1.52", "rdsAdminUser": "rdsAdmin", "rdsAdminPasswd": "Gauss_123", "rdsMetricUser": "rdsMetric", "rdsMetricPasswd": "Gauss_123", "rdsReplUser": "rdsRepl", "rdsReplPasswd": "Gauss_123", "rdsBackupUser": "rdsBackup", "rdsBackupPasswd": "Gauss_123", "dbUser": "root", "dbUserPasswd": "Gauss_123", "params": {"enable_thread_pool": "on", "enable_bbox_dump": "on", "bbox_dump_path": "/home/core"}, "cnParams": {}, "dnParams": {}, "cmParams": {}, "ca_crt_filename": "ca.pem", "server_crt_filename": "server.pem", "server_key_filename": "server.key", "ssl_cert_passwd": "Gauss_2xx", "ssl_cert_path": "/home/omm/sslcrt", "enableForceSwitch": null, "clusterMode": "combined", "func_name": "install"}' | python3 /data/cluster/adaptor/om_controller
[2023-08-28 19:16:31][root][INFO]:The result of install is {"retcode": 1, "detailmsg": "Fail to install, Error: [GAUSS-52503] : Failed to execute checkRunStatus.py. Error: Check the current task [installation] failed. Error:\n[GAUSS-51637] : The valid return item number [2] does not match with host number[3]. The return result:\n192.168.1.54: PID COMMAND\n 69002 /usr/lib/systemd/systemd --user\n 69003 (sd-pam)\n 69170 python3 /home/omm/dbs/om_agent/agent_256e52f5/om_agent.py\n 77959 /data/cluster/core/app/bin/om_monitor -L /data/cluster/logs/gaussdb/omm/cm/om_monitor\n 88369 sh -\n 88445 ps -u omm -o pid,command\n[1] 19:11:32 [SUCCESS] 192.168.1.54\n192.168.1.53: PID COMMAND\n 68180 /usr/lib/systemd/systemd --user\n 68181 (sd-pam)\n 68347 python3 /home/omm/dbs/om_agent/agent_256e52f5/om_agent.py\n 77137 /data/cluster/core/app/bin/om_monitor -L /data/cluster/logs/gaussdb/omm/cm/om_monitor\n 87542 sh -\n 87618 ps -u omm -o pid,command\n[2] 19:11:32 [SUCCESS] 192.168.1.53\n."}
[2023-08-28 19:16:31][root][ERROR]:InstallCluster in local host 192.168.1.52 execute failed, Error: {"retcode": 1, "detailmsg": "Fail to install,
Error: [GAUSS-52503] : Failed to execute checkRunStatus.py. Error: Check the current task [installation] failed.
Error:\n[GAUSS-51637] : The valid return item number [2] does not match with host number[3]. The return result:\n192.168.1.54: PID COMMAND\n 69002 /usr/lib/systemd/systemd --user\n 69003 (sd-pam)\n 69170 python3 /home/omm/dbs/om_agent/agent_256e52f5/om_agent.py\n 77959 /data/cluster/core/app/bin/om_monitor -L /data/cluster/logs/gaussdb/omm/cm/om_monitor\n 88369 sh -\n 88445 ps -u omm -o pid,command\n[1] 19:11:32 [SUCCESS] 192.168.1.54\n192.168.1.53: PID COMMAND\n 68180 /usr/lib/systemd/systemd --user\n 68181 (sd-pam)\n 68347 python3 /home/omm/dbs/om_agent/agent_256e52f5/om_agent.py\n 77137 /data/cluster/core/app/bin/om_monitor -L /data/cluster/logs/gaussdb/omm/cm/om_monitor\n 87542 sh -\n 87618 ps -u omm -o pid,command\n[2] 19:11:32 [SUCCESS] 192.168.1.53\n."}
installCluster installation faild
As expected, the deployment failed at the end, reporting: GAUSS-51637: The valid return item number [2] does not match with host number [3].
The error points to a problem in the hand-edited configuration, and there is indeed a visible slip in the file above: in the third shard, the gauss02 entry carries "az": "AZ1" while every other gauss02 entry uses "AZ2". Whether or not this particular typo was the fatal one, it is exactly the kind of mistake hand-editing invites, and the installer's error message gives no direct hint of it. A quick consistency check such as the sketch below would have caught it.
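A minimal sketch of such a check, using only the python3 standard library (which the installer already requires): verify that every node entry with the same "ip" always carries the same "az".
python3 - <<'EOF'
import json, collections
conf = json.load(open('/data/GaussDBInstaller/install_cluster.json'))
seen = collections.defaultdict(set)
def walk(obj):
    # record the az observed for each ip, wherever the entry appears
    if isinstance(obj, dict):
        if 'ip' in obj and 'az' in obj:
            seen[obj['ip']].add(obj['az'])
        for v in obj.values():
            walk(v)
    elif isinstance(obj, list):
        for v in obj:
            walk(v)
walk(conf)
for ip, azs in sorted(seen.items()):
    print(ip, 'OK' if len(azs) == 1 else 'INCONSISTENT: ' + ', '.join(sorted(azs)))
EOF
For the hand-edited file above, this flags 192.168.1.53 as inconsistent (AZ1 vs AZ2).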
(2) Deploying with an edited copy of the template file
Now let's use one of the template files shipped with the database and see how the deployment goes.
python3 gaussdb_install.py --action uninstall_cleanup    # clean up the failed cluster first
[root@gaussdb01 jsonFileSample]# pwd
/data/GaussDBInstaller/jsonFileSample
Copy a template file:
[root@gaussdb01 jsonFileSample]# cp 3_nodes_distributed.json ../install_cluster.json
vi /data/GaussDBInstaller/install_cluster.json
Now substitute our planned values into the file: each node entry under AZ1, AZ2, and AZ3 gets its rack, az, ip, dataIp, and virtualIp.
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
The edited template then looks like this:
[omm@gaussdb01 GaussDBInstaller]$ cat install_cluster.json
{
"rdsAdminUser":"rdsAdmin",
"rdsAdminPasswd":"Gauss_123",
"rdsMetricUser":"rdsMetric",
"rdsMetricPasswd":"Gauss_123",
"rdsReplUser":"rdsRepl",
"rdsReplPasswd":"Gauss_123",
"rdsBackupUser":"rdsBackup",
"rdsBackupPasswd":"Gauss_123",
"dbPort":"8000",
"dbUser":"root",
"dbUserPasswd":"Gauss_123",
"clusterMode":"combined",
"params":{
"enable_thread_pool":"on",
"enable_bbox_dump":"on",
"bbox_dump_path":"/home/core"
},
"cnParams":{
},
"dnParams":{
},
"cmParams":{
},
"clusterConf":{
"clusterName":"Gauss_XuanYuan",
"encoding": "utf8",
"shardingNum": 3,
"replicaNum": 3,
"solution":"hws",
"cm":[
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"cn":[
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"gtm":[
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
"shards":[
[
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
[
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
],
[
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
}
]
],
"etcd":{
"nodes":[
{
"rack": "gauss001",
"az": "AZ1",
"ip": "192.168.1.52",
"dataIp":"192.168.2.52",
"virtualIp":"192.168.3.52"
},
{
"rack": "gauss002",
"az": "AZ2",
"ip": "192.168.1.53",
"dataIp":"192.168.2.53",
"virtualIp":"192.168.3.53"
},
{
"rack": "gauss003",
"az": "AZ3",
"ip": "192.168.1.54",
"dataIp":"192.168.2.54",
"virtualIp":"192.168.3.54"
}
]
}
}
}
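Before re-running the deployment, a cheap syntax check on the edited file is worthwhile; python3 is already on the hosts, and its built-in JSON parser rejects any syntax error introduced during editing:
# exits non-zero and reports the position of any JSON syntax error
python3 -m json.tool /data/GaussDBInstaller/install_cluster.json > /dev/null && echo "JSON syntax OK"
Note that this catches malformed JSON only; semantically wrong values, like the az slip in section (1), still need a consistency check of the kind sketched earlier.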
Re-run the deployment command and see whether it succeeds this time.
--Running the installation again with the template-based file: the earlier GAUSS-51637 error is gone, but the startup phase is worth watching.
python3 gaussdb_install.py --action main
--Node 1: dn_6009 is in pending state.
[root@gaussdb01 dn_6009]# ps -ef |grep omm |grep datanode
omm 252343 1 8 22:35 ? 00:02:02 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6009 -M pending
omm 260172 1 6 22:37 ? 00:01:24 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6001 -M standby
omm 267743 1 16 22:40 ? 00:03:25 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6005 -M standby
--Node 2: all DNs have become standby.
[root@gaussdb02 dn]# ps -ef |grep omm |grep datanode
omm 216054 1 5 22:38 ? 00:01:14 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6008 -M standby
omm 217523 1 18 22:39 ? 00:04:00 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6004 -M standby
omm 217734 1 5 22:39 ? 00:01:11 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6002 -M standby
--Node 3: all DNs are in pending state.
[root@gaussdb03 dn]# ps -ef |grep omm |grep datanode
omm 205282 1 10 22:35 ? 00:02:38 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6003 -M pending
omm 205303 1 10 22:35 ? 00:02:54 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6006 -M pending
omm 308174 1 0 23:01 ? 00:00:00 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6007 -M pending
We keep waiting......
[2023-08-28 23:17:14][root][ERROR]:InstallCluster in local host 192.168.1.52 execute failed, Error: {"retcode": 1, "detailmsg":
"Fail to install, Error: Result exception error : Failed to start cluster.
Error: cm_ctl: starting the ETCD cluster. .
cm_ctl: the ETCD cluster starts successfully.
cm_ctl: checking cluster status.
cm_ctl: checking cluster status.
cm_ctl: checking finished in 2885 ms.
cm_ctl: start cluster.
cm_ctl: start nodeid: 1
cm_ctl: start nodeid: 2
cm_ctl: start nodeid: 3
cm_ctl: start cluster failed in (600)s."}
The cluster did get deployed, but it failed to start, ending with: cm_ctl: start cluster failed in (600)s.
That is a 600-second startup timeout, and the cause is insufficient memory for the cluster's default settings. Each node runs one CN and three DNs (plus GTM, CM server, and etcd); with the default memory parameters their combined demand exceeds the roughly 14 GB of RAM per node (judging by the tmpfs sizes in the df output earlier). We adjust the limits manually: capping max_process_memory at 3 GB keeps the four main gaussdb instances per node at about 12 GB combined, which fits.
Run the following commands on any one node:
[omm@gaussdb01 ~]$ gs_guc set -Z coordinator -Z datanode -N all -I all -c "shared_buffers = 1024MB"
The gs_guc run with the following arguments: [gs_guc -Z coordinator -Z datanode -N all -I all -c shared_buffers = 1024MB set ].
Begin to perform the total nodes: 3.
Popen count is 3, Popen success count is 3, Popen failure count is 0.
Begin to perform gs_guc for coordinators.
Command count is 3, Command success count is 3, Command failure count is 0.
Total instances: 3. Failed instances: 0.
ALL: Success to perform gs_guc!
Begin to perform the total nodes: 3.
Popen count is 3, Popen success count is 3, Popen failure count is 0.
Begin to perform gs_guc for datanodes.
Command count is 3, Command success count is 3, Command failure count is 0.
Total instances: 9. Failed instances: 0.
ALL: Success to perform gs_guc!
[omm@gaussdb01 ~]$ gs_guc set -Z coordinator -Z datanode -N all -I all -c "max_process_memory = 3GB"
The gs_guc run with the following arguments: [gs_guc -Z coordinator -Z datanode -N all -I all -c max_process_memory = 3GB set ].
Begin to perform the total nodes: 3.
Popen count is 3, Popen success count is 3, Popen failure count is 0.
Begin to perform gs_guc for coordinators.
Command count is 3, Command success count is 3, Command failure count is 0.
Total instances: 3. Failed instances: 0.
ALL: Success to perform gs_guc!
Begin to perform the total nodes: 3.
Popen count is 3, Popen success count is 3, Popen failure count is 0.
Begin to perform gs_guc for datanodes.
Command count is 3, Command success count is 3, Command failure count is 0.
Total instances: 9. Failed instances: 0.
ALL: Success to perform gs_guc!
gs_guc set -Z coordinator -Z datanode -N all -I all -c "shared_buffers = 1024MB"
gs_guc set -Z coordinator -Z datanode -N all -I all -c "max_process_memory = 3GB"
These two statements are the key fix. With the memory parameters adjusted, we start the cluster again:
[omm@gaussdb01 ~]$ cm_ctl start
cm_ctl: starting the ETCD cluster.
.
cm_ctl: the ETCD cluster starts successfully.
cm_ctl: checking cluster status.
cm_ctl: checking cluster status.
cm_ctl: checking finished in 4882 ms.
cm_ctl: start cluster.
cm_ctl: start nodeid: 1
cm_ctl: start nodeid: 2
cm_ctl: start nodeid: 3
...........
cm_ctl: start cluster successfully.
The cluster now starts successfully.
Let's look at the DN processes on each node again.
--Observing the node states once more: three DNs are still in pending state.
[root@gaussdb01 dn]# ps -ef |grep omm |grep datanode
omm 252343 1 7 22:35 ? 00:04:16 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6009 -M pending
omm 260172 1 5 22:37 ? 00:03:08 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6001 -M standby
omm 267743 1 10 22:40 ? 00:05:24 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6005 -M standby
[root@gaussdb01 dn]#
[root@gaussdb02 dn]# ps -ef |grep omm |grep datanode
omm 216054 1 5 22:38 ? 00:02:41 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6008 -M standby
omm 217523 1 11 22:39 ? 00:06:10 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6004 -M standby
omm 217734 1 5 22:39 ? 00:02:38 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6002 -M standby
[root@gaussdb03 dn]# ps -ef |grep omm |grep datanode
omm 409968 1 12 23:29 ? 00:00:35 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6007 -M pending
omm 411529 1 15 23:30 ? 00:00:36 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6006 -M pending
omm 413614 1 5 23:30 ? 00:00:11 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6003 -M standby
Restart the cluster:
cm_ctl stop
cm_ctl start
[root@gaussdb01 dn]# ps -ef |grep omm |grep datanode
omm 492859 1 8 23:50 ? 00:00:21 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6001 -M pending
omm 492890 1 9 23:50 ? 00:00:22 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6005 -M pending
omm 492924 1 15 23:50 ? 00:00:38 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6009 -M pending
[root@gaussdb02 dn]# ps -ef |grep omm |grep datanode
omm 439842 1 4 23:50 ? 00:00:12 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6002 -M pending
omm 439871 1 5 23:50 ? 00:00:13 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6004 -M pending
omm 439907 1 20 23:50 ? 00:00:52 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6008 -M pending
[omm@gaussdb03 ~]$ ps -ef |grep omm |grep datanode
omm 452166 1 5 23:50 ? 00:00:13 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6003 -M pending
omm 452193 1 4 23:50 ? 00:00:13 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6006 -M pending
omm 452221 1 20 23:50 ? 00:00:55 /data/cluster/core/app/bin/gaussdb --datanode -D /data/cluster/data/dn/dn_6007 -M pending
omm 459827 456298 0 23:54 pts/0 00:00:00 grep datanode
In the process list, some DNs show standby and some show pending.
Let's check the cluster state with a different command instead.
[omm@gaussdb01 ~]$ cm_ctl query -Cvipd
[ CMServer State ]
node node_ip instance state
----------------------------------------------------------------------------
1 192.168.1.52 192.168.1.52 1 /data/cluster/data/cm/cm_server Standby
2 192.168.1.53 192.168.1.53 2 /data/cluster/data/cm/cm_server Primary
3 192.168.1.54 192.168.1.54 3 /data/cluster/data/cm/cm_server Standby
[ ETCD State ]
node node_ip instance state
--------------------------------------------------------------------------
1 192.168.1.52 192.168.1.52 7001 /data/cluster/data/etcd StateLeader
2 192.168.1.53 192.168.1.53 7002 /data/cluster/data/etcd StateFollower
3 192.168.1.54 192.168.1.54 7003 /data/cluster/data/etcd StateFollower
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Coordinator State ]
node node_ip instance state
------------------------------------------------------------------
1 192.168.1.52 192.168.2.52 5001 8000 /data/cluster/data/cn Normal
2 192.168.1.53 192.168.2.53 5002 8000 /data/cluster/data/cn Normal
3 192.168.1.54 192.168.2.54 5003 8000 /data/cluster/data/cn Normal
[ Central Coordinator State ]
node node_ip instance state
------------------------------------------------------------------
1 192.168.1.52 192.168.2.52 5001 /data/cluster/data/cn Normal
[ GTM State ]
node node_ip instance state sync_state
---------------------------------------------------------------------------------------------------
1 192.168.1.52 192.168.2.52 1001 /data/cluster/data/gtm P Primary Connection ok Sync
2 192.168.1.53 192.168.2.53 1002 /data/cluster/data/gtm S Standby Connection ok Sync
3 192.168.1.54 192.168.2.54 1003 /data/cluster/data/gtm S Standby Connection ok Sync
[ Datanode State ]
node node_ip instance state | node node_ip instance state | node node_ip instance state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 192.168.1.52 192.168.2.52 6001 33100 /data/cluster/data/dn/dn_6001 P Primary Normal | 2 192.168.1.53 192.168.2.53 6002 33100 /data/cluster/data/dn/dn_6002 S Standby Normal | 3 192.168.1.54 192.168.2.54 6003 33100 /data/cluster/data/dn/dn_6003 S Standby Normal
2 192.168.1.53 192.168.2.53 6004 33120 /data/cluster/data/dn/dn_6004 P Standby Normal | 1 192.168.1.52 192.168.2.52 6005 33120 /data/cluster/data/dn/dn_6005 S Primary Normal | 3 192.168.1.54 192.168.2.54 6006 33120 /data/cluster/data/dn/dn_6006 S Standby Normal
3 192.168.1.54 192.168.2.54 6007 33140 /data/cluster/data/dn/dn_6007 P Standby Normal | 2 192.168.1.53 192.168.2.53 6008 33140 /data/cluster/data/dn/dn_6008 S Standby Normal | 1 192.168.1.52 192.168.2.52 6009 33140 /data/cluster/data/dn/dn_6009 S Primary Normal
Primary and standby instances are all Normal here. This also shows that a pending state in the ps output does not necessarily mean the node has a problem; the stale -M pending flag on the command line is best regarded as a cosmetic bug.
To check cluster status, use the cm_ctl query -Cvipd command rather than inspecting processes.
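While waiting for a cluster to settle, it is convenient to poll just the summary fields rather than re-reading the full report (watch and grep are standard tools):
watch -n 10 'cm_ctl query -Cvipd | grep -E "cluster_state|balanced"'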
8. Logging In to Test the Database
[omm@gaussdb01 ~]$ gsql -d postgres -U root -p 8000 -W Gauss_123
gsql ((GaussDB Kernel 503.1.0.SPC1200 build c28d95e9) compiled at 2023-07-05 16:26:07 commit 5703 last mr 11933 release)
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
gaussdb=> create database test;
CREATE DATABASE
gaussdb=> \c test
Password for user root:
Non-SSL connection (SSL connection is recommended when requiring high-security)
You are now connected to database "test" as user "root".
test=>
test=> create table test1(id int,name varchar(20));
NOTICE: The 'DISTRIBUTE BY' clause is not specified. Using 'id' as the distribution column by default.
HINT: Please use 'DISTRIBUTE BY' clause to specify suitable data distribution column.
CREATE TABLE
test=> insert into test1 values(1,'xsq'),(2,'薛双奇');
INSERT 0 2
test=> select * from test1;
id | name
----+--------
1 | xsq
2 | 薛双奇
(2 rows)
The database is working normally.
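Following the NOTICE and HINT above, it is better practice to name the distribution column explicitly rather than relying on the default. A sketch using gsql's -c option (test2 is a hypothetical table, and hash distribution on id is our illustrative choice, not part of the original session):
# create a table with an explicit distribution column
gsql -d test -U root -p 8000 -W Gauss_123 -c "create table test2(id int, name varchar(20)) DISTRIBUTE BY HASH(id);"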
9. Summary
(1) After several attempts, we found a workaround: once the deployment command (python3 gaussdb_install.py --action main) has started the DNs on all three nodes, but before the command finishes, execute:
gs_guc set -Z coordinator -Z datanode -N all -I all -c "shared_buffers = 1024MB"
gs_guc set -Z coordinator -Z datanode -N all -I all -c "max_process_memory = 3GB"
Running these two commands at that point effectively prevents the cluster-start failure (the 600 s timeout) at the end of deployment; the run then finishes with "Installation successfully" instead of "failed".
(2) When deploying a distributed database, edit a copy of the configuration template the database ships with; writing the file by hand is error-prone and easily leads to a failed deployment.
A final observation from our testing: each full GaussDB distributed deployment takes a little over an hour, which is quite slow.