Preparation

Deployment plan

Load Balancer + ReplicatedMergeTree + Distributed + ZooKeeper

2 shards 2 replicas

Machine information

Hostname     IP             Shard   Replica
clickhouse1  192.168.0.13   shard1  replica1
clickhouse2  192.168.0.14   shard1  replica2
clickhouse3  192.168.0.15   shard2  replica1
clickhouse4  192.168.0.100  shard2  replica2

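The plan uses 4 nodes: 2 shards with 2 replicas each. The replicas of shard 1 live on hosts clickhouse1 and clickhouse2, and the replicas of shard 2 live on hosts clickhouse3 and clickhouse4.

The official recommendation is to deploy the ZooKeeper cluster separately from the ClickHouse cluster, so that resource contention does not cause service failures.

Here the ZooKeeper cluster is deployed on k8s (its deployment is covered in a separate tutorial).

The ClickHouse cluster itself is deployed with Docker.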

Deploying the ClickHouse cluster

Getting the configuration files

  1. Following the official tutorial, start clickhouse-server
    docker run -d --name clickhouse-server --ulimit nofile=262144:262144 --volume=/data/clickhouse/:/var/lib/clickhouse yandex/clickhouse-server
    
  2. Once the container is running, copy the configuration files from the container to the host
    mkdir -p /etc/clickhouse-server
    docker cp clickhouse-server:/etc/clickhouse-server/ /etc/
    

Editing /etc/clickhouse-server/config.xml

  1. Find the following settings, uncomment them, and adjust them

    <listen_host>::1</listen_host>
    <listen_host>0.0.0.0</listen_host>
    <listen_host>127.0.0.1</listen_host>
    
  2. Find where metrika.xml is referenced and point the include_from node at the file that is actually used

     <!-- If element has 'incl' attribute, then for it's value will be used corresponding substitution from another file.
             By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.
             Values for substitutions are specified in /yandex/name_of_substitution elements in that file.
          -->
     <include_from>/etc/clickhouse-server/metrika.xml</include_from>
    
  3. Add the shard and replica information

        <!-- Configuration of clusters that could be used in Distributed tables.
             https://clickhouse.com/docs/en/operations/table_engines/distributed/
          -->
        <remote_servers>
            <!-- Test only shard config for testing distributed storage -->
            <cluster_2s_2r>
                <!-- Shard 1 -->
                <shard>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>192.168.0.13</host>
                        <port>9000</port>
                        <user>default</user>
                        <password></password>
                    </replica>
                    <replica>
                        <host>192.168.0.14</host>
                        <port>9000</port>
                        <user>default</user>
                        <password></password>
                    </replica>
                </shard>
    
                <!-- Shard 2 -->
                <shard>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>192.168.0.15</host>
                        <port>9000</port>
                        <user>default</user>
                        <password></password>
                    </replica>
                    <replica>
                        <host>192.168.0.100</host>
                        <port>9000</port>
                        <user>default</user>
                        <password></password>
                    </replica>
                </shard>
            </cluster_2s_2r>
        </remote_servers>
    
  4. Sync these files to the other machines
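
The sync step can be sketched as a small loop over the other three nodes from the machine table. This is a minimal sketch, not the author's procedure: it assumes passwordless SSH as root to each node, and the `DRY_RUN` variable and `sync_configs` function are hypothetical names introduced here. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: copy /etc/clickhouse-server from this host to the other nodes.
# Assumptions (not from the original post): root SSH access, scp installed.

CONFIG_DIR=/etc/clickhouse-server
OTHER_NODES="192.168.0.14 192.168.0.15 192.168.0.100"

# Prints the scp commands when DRY_RUN is unset or 1; runs them when DRY_RUN=0.
sync_configs() {
  for node in $OTHER_NODES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -r $CONFIG_DIR root@$node:/etc/"
    else
      scp -r "$CONFIG_DIR" "root@$node:/etc/" || return 1
    fi
  done
}

sync_configs
```

Running the script with `DRY_RUN=0` performs the copy. Files that differ per node still have to be edited on each node afterwards.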

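Adding metrika.xml to /etc/clickhouse-server/

metrika.xml mainly configures the number of shards and replicas and how they map to machines. The file differs from machine to machine; a complete example follows

<!-- All instances share this cluster configuration; nothing here is instance-specific -->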
<yandex>

    <!-- Cluster configuration -->
    <!-- clickhouse_remote_servers is identical on all instances -->
    <clickhouse_remote_servers>
        <cluster_2s_2r>

            <!-- Shard 1 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.13</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>192.168.0.14</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>

            <!-- Shard 2 -->
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.0.15</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
                <replica>
                    <host>192.168.0.100</host>
                    <port>9000</port>
                    <user>default</user>
                    <password></password>
                </replica>
            </shard>
        </cluster_2s_2r>
    </clickhouse_remote_servers>

    <!-- ZooKeeper -->
    <!-- The zookeeper section is identical on all instances -->
    <zookeeper>
        <node>
            <host>192.168.20.35</host>
            <port>2181</port>
        </node>
    </zookeeper>
    
    <!-- macros differ per instance: shard 1, replica 1 -->
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>192.168.0.13</replica>
    </macros>


    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- Data compression algorithm -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>

</yandex>

Here we only need to change the macros section in each machine's metrika.xml
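
Since only the macros differ, the per-node block can be generated from the shard number and the node IP in the machine table. The helper below is a sketch for illustration: `render_macros` is a hypothetical name, and the output still has to be pasted into (or templated into) each node's metrika.xml by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the <macros> block for one node.
# Usage: render_macros SHARD REPLICA_IP
render_macros() {
  cat <<EOF
    <macros>
        <layer>01</layer>
        <shard>$1</shard>
        <replica>$2</replica>
    </macros>
EOF
}

# Shard/replica assignment taken from the machine table.
render_macros 01 192.168.0.13    # clickhouse1: shard 1, replica 1
render_macros 01 192.168.0.14    # clickhouse2: shard 1, replica 2
render_macros 02 192.168.0.15    # clickhouse3: shard 2, replica 1
render_macros 02 192.168.0.100   # clickhouse4: shard 2, replica 2
```

The two replicas of one shard share the same `shard` value and differ only in `replica`; that pairing is what the per-node files need to preserve.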

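clickhouse1 (shard 1, replica 1)

Stores the data of shard 1

<!-- All instances share this cluster configuration; nothing here is instance-specific -->
<yandex>

    <!-- Cluster configuration -->
    <!-- clickhouse_remote_servers is identical on all instances -->
    ...

    <!-- ZooKeeper -->
    <!-- The zookeeper section is identical on all instances -->
    ...

    <!-- macros differ per instance: shard 1, replica 1 -->
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>192.168.0.13</replica>
    </macros>


    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- Data compression algorithm -->
    ...

</yandex>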

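clickhouse2 (shard 1, replica 2)

Stores a replica of shard 1's data, identical to the data on clickhouse1

<!-- All instances share this cluster configuration; nothing here is instance-specific -->
<yandex>

    <!-- Cluster configuration -->
    <!-- clickhouse_remote_servers is identical on all instances -->
    ...

    <!-- ZooKeeper -->
    <!-- The zookeeper section is identical on all instances -->
    ...

    <!-- macros differ per instance: shard 1, replica 2 -->
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>192.168.0.14</replica>
    </macros>


    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- Data compression algorithm -->
    ...

</yandex>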

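clickhouse3 (shard 2, replica 1)

Stores the data of shard 2

<!-- All instances share this cluster configuration; nothing here is instance-specific -->
<yandex>

    <!-- Cluster configuration -->
    <!-- clickhouse_remote_servers is identical on all instances -->
    ...

    <!-- ZooKeeper -->
    <!-- The zookeeper section is identical on all instances -->
    ...

    <!-- macros differ per instance: shard 2, replica 1 -->
    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <replica>192.168.0.15</replica>
    </macros>


    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- Data compression algorithm -->
    ...

</yandex>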

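clickhouse4 (shard 2, replica 2)

Stores a replica of shard 2's data, identical to the data on clickhouse3

<!-- All instances share this cluster configuration; nothing here is instance-specific -->
<yandex>

    <!-- Cluster configuration -->
    <!-- clickhouse_remote_servers is identical on all instances -->
    ...

    <!-- ZooKeeper -->
    <!-- The zookeeper section is identical on all instances -->
    ...

    <!-- macros differ per instance: shard 2, replica 2 -->
    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <replica>192.168.0.100</replica>
    </macros>


    <networks>
        <ip>::/0</ip>
    </networks>

    <!-- Data compression algorithm -->
    ...

</yandex>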

Starting the service

docker run -d \
--name clickhouse \
--ulimit nofile=262144:262144 \
--volume=/data/clickhouse:/var/lib/clickhouse \
--volume=/etc/clickhouse-server/:/etc/clickhouse-server/ \
--add-host clickhouse1:192.168.0.13 \
--add-host clickhouse2:192.168.0.14 \
--add-host clickhouse3:192.168.0.15 \
--add-host clickhouse4:192.168.0.100 \
--hostname $(hostname) \
-p 9000:9000 \
-p 8123:8123 \
-p 9009:9009 \
mirror.corp.wuyacapital.com/common/clickhouse-server

Verifying the cluster

Viewing the cluster

Log in to the cluster with clickhouse-client and inspect it

# Log in to the cluster
docker run -it  --rm  --add-host clickhouse1:192.168.0.13 --add-host clickhouse2:192.168.0.14 --add-host clickhouse3:192.168.0.15 --add-host clickhouse4:192.168.0.100  yandex/clickhouse-client  --host clickhouse1 --port 9000
# Login succeeded
ClickHouse client version 22.1.3.7 (official build).
Connecting to clickhouse1:9000 as user default.
Connected to ClickHouse server version 22.1.3 revision 54455.

clickhouse1 :)

Query the cluster information

clickhouse1 :) select * from system.clusters

SELECT *
FROM system.clusters

Query id: 9f13df9c-862f-44f1-8abd-bf47cdb4eb9c

┌─cluster───────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─────┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ cluster_2s_2r │         1 │            1 │           1 │ 192.168.0.13  │ 192.168.0.13  │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
│ cluster_2s_2r │         1 │            1 │           2 │ 192.168.0.14  │ 192.168.0.14  │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
│ cluster_2s_2r │         2 │            1 │           1 │ 192.168.0.15  │ 192.168.0.15  │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
│ cluster_2s_2r │         2 │            1 │           2 │ 192.168.0.100 │ 192.168.0.100 │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
└───────────────┴───────────┴──────────────┴─────────────┴───────────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

4 rows in set. Elapsed: 0.005 sec.

The cluster was created successfully, and the shards and replicas are configured correctly
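
As a possible next step, the macros configured per node are what a ReplicatedMergeTree table would use for its ZooKeeper path and replica name, with a Distributed table on top for the cluster `cluster_2s_2r`. The sketch below only prints example DDL. The table names `events_local` and `events_all` and the column list are hypothetical, and it assumes the `macros` and `zookeeper` sections from metrika.xml are actually loaded by config.xml (for example through `incl` attributes), which the steps above do not show.

```shell
#!/bin/sh
# Sketch: print example DDL for a replicated local table plus a Distributed table.
# Table and column names are hypothetical, not from the original post.
print_ddl() {
  cat <<'EOF'
CREATE TABLE IF NOT EXISTS default.events_local
(
    event_date Date,
    id UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/events_local', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY id;

CREATE TABLE IF NOT EXISTS default.events_all AS default.events_local
ENGINE = Distributed(cluster_2s_2r, default, events_local, rand());
EOF
}

print_ddl
# To apply on one node (repeat for every node), assuming clickhouse-client is installed:
#   print_ddl | clickhouse-client --host clickhouse1 --port 9000 --multiquery
```

Each node expands `{layer}`, `{shard}` and `{replica}` from its own macros, so the same DDL yields one ZooKeeper path per shard and a distinct replica name per node.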

