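| Component | Node role | IP address | Ports | Notes |
| --- | --- | --- | --- | --- |
| ZooKeeper | Coordination node | 192.168.0.121 | 2181 | Manages metadata and replication queues (required by ClickHouse) |
| ClickHouse | Master node | 192.168.0.121 | 8123/9000 | Replica 1 (ck-master) |
| ClickHouse | Slave node | 192.168.0.116 | 8123/9000 | Replica 2 (ck-slave) |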

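Step 1: Deploy ZooKeeper (run only on 192.168.0.121)
ClickHouse relies on ZooKeeper for data replication (ReplicatedMergeTree) and for distributed DDL (ON CLUSTER). Start the ZooKeeper service on the master node first.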

# 1. Create directories for persistent data
mkdir -p /data/zookeeper/data
mkdir -p /data/zookeeper/log
chmod 777 /data/zookeeper/data /data/zookeeper/log

# 2. Remove any old container (avoids a name conflict)
docker rm -f zookeeper 2>/dev/null || true

# 3. Start the ZooKeeper container
docker run -d \
  --name zookeeper \
  --restart=always \
  --net=host \
  -e ZOOKEEPER_CLIENT_PORT=2181 \
  -e ZOOKEEPER_TICK_TIME=2000 \
  -v /data/zookeeper/data:/var/lib/zookeeper/data \
  -v /data/zookeeper/log:/var/lib/zookeeper/log \
  confluentinc/cp-zookeeper:7.5.0

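# 4. Verify that it started
echo ">>> Checking container status..."
docker ps | grep zookeeper

# ZooKeeper 3.5+ whitelists only the srvr four-letter command by default,
# so ruok is rejected unless 4lw.commands.whitelist is configured.
echo ">>> Testing the client port (srvr should print server stats, including a Mode line)..."
echo "srvr" | nc 127.0.0.1 2181 || echo "nc is not available; skip this step and check the logs instead"
docker logs zookeeper --tail 5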
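
As a scriptable alternative to reading the output by eye, the sketch below extracts the server mode from the output of ZooKeeper's `srvr` four-letter command, which is on the default whitelist in ZooKeeper 3.5 and later. The function name `zk_mode` is hypothetical, and the usage line assumes `nc` is installed on the host.

```shell
#!/usr/bin/env bash
# zk_mode: read `srvr` output on stdin and print the value of the "Mode:" line.
# Hypothetical helper; it only parses text and does not contact ZooKeeper itself.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Example (assumes nc is installed and ZooKeeper listens on 127.0.0.1:2181):
#   echo srvr | nc 127.0.0.1 2181 | zk_mode
```

For the single-node ZooKeeper started above, the mode should be `standalone`; an empty result suggests that the port is not answering.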

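Step 2: Configure the ClickHouse cluster (run on both nodes)
Edit ClickHouse's config.xml to set the ZooKeeper address, the cluster topology, and the macros that uniquely identify each node.

Run on the master node (192.168.0.121):

# Back up the original configuration
cp /data/clickhouse/config/config.xml /data/clickhouse/config/config.xml.bak

# Write the new configuration (note that the replica name in macros is ck-master)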
cat > /data/clickhouse/config/config.xml << 'EOF'
<?xml version="1.0"?>
<clickhouse>
    <logger>
        <level>information</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>100M</size>
        <count>10</count>
        <console>true</console>
    </logger>

    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
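    <interserver_http_port>9009</interserver_http_port>
    <!-- Address advertised to the other replica for part fetches; if omitted, the container hostname is used, which the peer may not resolve -->
    <interserver_http_host>192.168.0.121</interserver_http_host>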
    <listen_host>0.0.0.0</listen_host>

    <max_connections>4096</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>100</max_concurrent_queries>

    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>

    <!-- ZooKeeper configuration -->
    <zookeeper>
        <node index="1">
            <host>192.168.0.121</host>
            <port>2181</port>
        </node>
    </zookeeper>

    <!-- Distributed DDL queue -->
    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>

    <!-- Cluster definition -->
    <remote_servers>
        <ck_cluster>
            <shard>
                <replica>
                    <host>192.168.0.121</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.0.116</host>
                    <port>9000</port>
                </replica>
            </shard>
        </ck_cluster>
    </remote_servers>

    <!-- Macros: unique identifier of the master node -->
    <macros>
        <shard>01</shard>
        <replica>ck-master</replica>
    </macros>

    <mark_cache_size>5368709120</mark_cache_size>
</clickhouse>
EOF

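# Restart ClickHouse
docker restart clickhouse
echo ">>> Master node configured, waiting for startup..."
sleep 5
# docker logs replays the container's stderr on stderr, so merge it into stdout before grep
docker logs clickhouse --tail 50 2>&1 | grep -iE "zookeeper|ready"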
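
A fixed `sleep 5` may be too short on a slow host. A minimal sketch of a polling alternative follows; `retry` is a hypothetical helper, and the usage line assumes `curl` is installed and that port 8123 is reachable from the host. ClickHouse's HTTP interface answers `Ok.` on `/ping` once the server accepts connections.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is exhausted.
# Hypothetical helper, not part of ClickHouse or Docker.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: wait up to about 30 seconds for the HTTP interface to answer.
#   retry 30 1 curl -fsS http://127.0.0.1:8123/ping
```

The function returns 0 as soon as the command succeeds and 1 if every attempt fails, so it can gate the log check that follows the restart.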

Run on the slave node (192.168.0.116):

# Back up the original configuration
cp /data/clickhouse/config/config.xml /data/clickhouse/config/config.xml.bak

# Write the new configuration (the replica name in macros must differ; set it to ck-slave)
cat > /data/clickhouse/config/config.xml << 'EOF'
<?xml version="1.0"?>
<clickhouse>
    <logger>
        <level>information</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>100M</size>
        <count>10</count>
        <console>true</console>
    </logger>

    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
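    <interserver_http_port>9009</interserver_http_port>
    <!-- Address advertised to the other replica for part fetches; if omitted, the container hostname is used, which the peer may not resolve -->
    <interserver_http_host>192.168.0.116</interserver_http_host>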
    <listen_host>0.0.0.0</listen_host>

    <max_connections>4096</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>100</max_concurrent_queries>

    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <timezone>Asia/Shanghai</timezone>

    <!-- ZooKeeper configuration (points to the ZooKeeper on the master node) -->
    <zookeeper>
        <node index="1">
            <host>192.168.0.121</host>
            <port>2181</port>
        </node>
    </zookeeper>

    <!-- Distributed DDL queue -->
    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>

    <!-- Cluster definition -->
    <remote_servers>
        <ck_cluster>
            <shard>
                <replica>
                    <host>192.168.0.121</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.0.116</host>
                    <port>9000</port>
                </replica>
            </shard>
        </ck_cluster>
    </remote_servers>

    <!-- Macros: unique identifier of the slave node -->
    <macros>
        <shard>01</shard>
        <replica>ck-slave</replica>
    </macros>

    <mark_cache_size>5368709120</mark_cache_size>
</clickhouse>
EOF

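# Restart ClickHouse
docker restart clickhouse
echo ">>> Slave node configured, waiting for startup..."
sleep 5
# docker logs replays the container's stderr on stderr, so merge it into stdout before grep
docker logs clickhouse --tail 50 2>&1 | grep -iE "zookeeper|ready"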
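
Apart from per-node values such as the `<replica>` macro, the two config.xml files are identical. One way to avoid maintaining two near-duplicate files is to keep the shared settings in config.xml and generate only the macros as an override file. The sketch below assumes that ClickHouse merges files from a `config.d/` directory next to config.xml and that `/data/clickhouse/config/config.d/` is mounted into the container; the steps above do not show how the container was started, so treat that path as an assumption. `render_macros` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# render_macros SHARD REPLICA: print a ClickHouse override file containing only <macros>.
# Hypothetical helper; the output is meant for a config.d/ directory (assumed to be mounted).
render_macros() {
  local shard="$1" replica="$2"
  cat <<EOF
<?xml version="1.0"?>
<clickhouse>
    <macros>
        <shard>${shard}</shard>
        <replica>${replica}</replica>
    </macros>
</clickhouse>
EOF
}

# Example (on the slave node; the target path is an assumption):
#   render_macros 01 ck-slave > /data/clickhouse/config/config.d/macros.xml
```

With this layout the `<macros>` block would be removed from the shared config.xml, and only the generated file would differ between nodes.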

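Step 3: Verify the cluster and data synchronization

This is the moment of truth for the deployment: create a replicated table, write data on the master node, and read it back on the slave node.
1. Create the test database and table (run on the master node)

# Run on 192.168.0.121
echo ">>> Creating the database and replicated table on the cluster..."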

# One statement per call: older clickhouse-client versions reject multiple
# statements in --query unless --multiquery is set
docker exec -it clickhouse clickhouse-client --query "
CREATE DATABASE IF NOT EXISTS test_db ON CLUSTER ck_cluster
"

docker exec -it clickhouse clickhouse-client --query "
CREATE TABLE IF NOT EXISTS test_db.user_events ON CLUSTER ck_cluster
(
    event_time DateTime,
    user_id UInt64,
    event_type String,
    event_data String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/user_events', '{replica}')
ORDER BY (event_time, user_id)
"

echo ">>> Table created."
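
Before inserting data, it can help to confirm that each node sees the cluster definition and that the replicated table has registered with ZooKeeper. The sketch below wraps two queries against the `system.clusters` and `system.replicas` system tables. The helper names (`ch_query`, `show_cluster`, `show_replicas`) and the `CH_CLIENT` variable are hypothetical; the default command assumes the container is named `clickhouse`, as in the steps above.

```shell
#!/usr/bin/env bash
# Command used to reach clickhouse-client; override CH_CLIENT to target another container.
CH_CLIENT="${CH_CLIENT:-docker exec -i clickhouse clickhouse-client}"

# ch_query SQL: run one statement through the configured client command.
ch_query() {
  # Intentionally unquoted so that the command string splits into words
  $CH_CLIENT --query "$1"
}

# List the replicas that this node knows about for ck_cluster.
show_cluster() {
  ch_query "SELECT shard_num, replica_num, host_address, port FROM system.clusters WHERE cluster = 'ck_cluster'"
}

# Show the replication state of the test table on this node.
show_replicas() {
  ch_query "SELECT replica_name, is_readonly, queue_size, active_replicas, total_replicas FROM system.replicas WHERE database = 'test_db' AND table = 'user_events'"
}
```

With both nodes healthy, `show_cluster` should list two rows and `show_replicas` should report `is_readonly` as 0 with two active replicas; a read-only replica usually points to a ZooKeeper connectivity problem.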

2. Insert test data (run on the master node)

# Run on 192.168.0.121
echo ">>> Inserting test data on the master node..."

docker exec -it clickhouse clickhouse-client --query "
INSERT INTO test_db.user_events VALUES 
(now(), 1001, 'login', 'User 1001 logged in from PC'),
(now(), 1002, 'click', 'User 1002 clicked button A'),
(now(), 1003, 'purchase', 'User 1003 bought item X')
"

echo ">>> Data written successfully."

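3. Verify data synchronization (run on the slave node)
Switch to 192.168.0.116 (the slave node) and run the following command:

# Run on 192.168.0.116
echo ">>> Querying data on the slave node (verifying replication)..."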

docker exec -it clickhouse clickhouse-client --query "
SELECT event_time, user_id, event_type, event_data 
FROM test_db.user_events 
ORDER BY event_time
FORMAT Pretty
"

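The manual check above can also be automated by comparing row counts on both replicas. The following is a minimal sketch, not a definitive test: it assumes the HTTP interface on port 8123 is reachable from where the script runs, that the `default` user has no password, and that `curl` is installed. `ch_http_query`, `check_sync`, and the `QUERY_FN` hook are hypothetical names.

```shell
#!/usr/bin/env bash
# ch_http_query HOST SQL: send SQL to a node's HTTP interface (port 8123).
# Assumes the default user without a password.
ch_http_query() {
  curl -fsS "http://$1:8123/" --data-binary "$2"
}

# check_sync: compare row counts of test_db.user_events on both replicas.
# Returns 0 if the counts match, 1 if they differ, 2 if a query fails.
# QUERY_FN is a hypothetical hook that lets the query function be replaced.
check_sync() {
  local fn="${QUERY_FN:-ch_http_query}"
  local sql="SELECT count() FROM test_db.user_events"
  local master slave
  master="$("$fn" 192.168.0.121 "$sql")" || return 2
  slave="$("$fn" 192.168.0.116 "$sql")" || return 2
  echo "master=${master} slave=${slave}"
  [ "$master" = "$slave" ]
}
```

Replication is asynchronous, so a mismatch immediately after an insert does not necessarily indicate a fault; running the check again after a short pause gives a fairer result.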