ClickHouse 实例 ck-ec818311 配置持久化详细分析
环境概述
| 项目 |
值 |
| 实例名称 |
ck-ec818311 |
| 命名空间 |
qfusion-admin |
| ClickHouse版本 |
24.8.7.41 |
| Operator版本 |
Altinity v0.18.5 (commit: 059f622) |
| 当前状态 |
Running (2副本) |
一、实例架构概览
1.1 集群拓扑
ck-ec818311 ClickHouse集群
│
┌───────────┴───────────┐
│ │
┌───────▼───────┐ ┌───────▼───────┐
│ Shard 0 │ │ ZooKeeper │
│ (1分片) │ │ (3节点) │
└───────┬───────┘ └───────┬───────┘
│ │
┌───────────┴───────────┐ │
│ │ │
┌────▼────┐ ┌────▼────┐ │
│Replica 0│ │Replica 1│ │
│qfusion2 │ │qfusion1 │ │
└────┬────┘ └────┬────┘ │
│ │ │
245.0.0.254 245.0.4.12 │
│ │ │
└───────────────────────┴───────────┘
│
Cluster Services
1.2 组件清单
ClickHouse 副本
| Pod名称 |
节点 |
IP |
状态 |
Age |
| ck-ec818311-replica0-0-0 |
qfusion2 |
245.0.0.254 |
Running (2/2) |
23m |
| ck-ec818311-replica0-1-0 |
qfusion1 |
245.0.4.12 |
Running (2/2) |
22m |
容器组成:
clickhouse: ClickHouse服务主容器
slowlog: 慢查询日志收集器
ZooKeeper 集群
| StatefulSet |
节点 |
Age |
| ck-ec818311-zk-0 |
qfusion2 |
23m |
| ck-ec818311-zk-1 |
qfusion3 |
22m |
| ck-ec818311-zk-2 |
qfusion1 |
21m |
Kubernetes Services
| Service名称 |
类型 |
ClusterIP |
端口 |
用途 |
| ck-ec8183110 |
ClusterIP |
246.100.135.25 |
8123, 9000 |
集群入口 |
| ck-ec818311-shard0 |
ClusterIP |
246.108.128.9 |
8123, 9000 |
分片服务 |
| ck-ec818311-replica0-0 |
ClusterIP |
246.96.254.131 |
8123, 9000, 22 |
副本0 |
| ck-ec818311-replica0-1 |
ClusterIP |
246.109.39.110 |
8123, 9000, 22 |
副本1 |
| ck-ec818311-zk-client |
ClusterIP |
246.102.224.177 |
2181 |
ZK客户端 |
| ck-ec818311-zk-headless |
ClusterIP |
None |
2181, 2888, 3888 |
ZK内部通信 |
二、CHI (ClickHouseInstallation) 配置详解
2.1 配置文件结构
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
name: ck-ec818311
namespace: qfusion-admin
labels:
AppName: ck-ec818311
DBRole: Clickhouse
DBType: ck
Type: Database
spec:
configuration:
settings: {}
profiles: {}
users: {}
files: {}
zookeeper: {}
clusters: []
templates:
podTemplates: []
serviceTemplates: []
volumeClaimTemplates: []
defaults:
templates: {}
2.2 基础设置 (configuration.settings)
settings:
http_port: "8123"
tcp_port: "9000"
interserver_http_port: "9010"
timezone: Asia/Shanghai
生成的配置文件: config.d/chop-generated-settings.xml
<yandex>
<http_port>8123</http_port>
<interserver_http_port>9010</interserver_http_port>
<tcp_port>9000</tcp_port>
<timezone>Asia/Shanghai</timezone>
</yandex>
2.3 内存配置
方式1: 通过 files 字段注入自定义配置文件
configuration:
files:
memory.xml: |
<yandex>
<max_server_memory_usage>3006477108</max_server_memory_usage>
</yandex>
方式2: 通过 profiles 字段配置
configuration:
profiles:
default/max_memory_usage: "3006477108"
default/max_bytes_before_external_group_by: "3006477108"
配置挂载路径: /etc/clickhouse-server/config.d/memory.xml
2.4 集群拓扑配置 (configuration.clusters)
configuration:
clusters:
- name: ck-ec818311
layout:
shardsCount: 1
replicasCount: 2
生成的远程服务器配置: config.d/chop-generated-remote_servers.xml
<yandex>
<remote_servers>
<ck-ec818311>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>ck-ec818311-replica0-0</host>
<port>9000</port>
<user>rdsadmin</user>
<password>xxx</password>
</replica>
<replica>
<host>ck-ec818311-replica0-1</host>
<port>9000</port>
<user>rdsadmin</user>
<password>xxx</password>
</replica>
</shard>
</ck-ec818311>
</remote_servers>
</yandex>
2.5 ZooKeeper 配置
configuration:
zookeeper:
nodes:
- host: ck-ec818311-zk-0-0.ck-ec818311-zk-headless.qfusion-admin
port: 2181
- host: ck-ec818311-zk-1-0.ck-ec818311-zk-headless.qfusion-admin
port: 2181
- host: ck-ec818311-zk-2-0.ck-ec818311-zk-headless.qfusion-admin
port: 2181
session_timeout_ms: 30000
operation_timeout_ms: 10000
生成的配置: conf.d/chop-generated-zookeeper.xml
<yandex>
<zookeeper>
<node>
<host>ck-ec818311-zk-0-0.ck-ec818311-zk-headless.qfusion-admin</host>
<port>2181</port>
</node>
<node>
<host>ck-ec818311-zk-1-0.ck-ec818311-zk-headless.qfusion-admin</host>
<port>2181</port>
</node>
<node>
<host>ck-ec818311-zk-2-0.ck-ec818311-zk-headless.qfusion-admin</host>
<port>2181</port>
</node>
<session_timeout_ms>30000</session_timeout_ms>
<operation_timeout_ms>10000</operation_timeout_ms>
</zookeeper>
<distributed_ddl>
<path>/clickhouse/ck-ec818311/task_queue/ddl</path>
</distributed_ddl>
</yandex>
2.6 用户配置
rdsadmin 管理员用户
配置文件: users.d/01-clickhouse-rdsadmin-profile.xml
<yandex>
<users>
<rdsadmin>
<networks>
<ip>127.0.0.1</ip>
<ip>0.0.0.0/0</ip>
<ip>::/0</ip>
</networks>
<password_sha256_hex>f954deb4f118e86dfd9da96e5dfb20296a502db175bf74a0a8fb6c36d85af4b2</password_sha256_hex>
<profile>rdsadmin</profile>
<quota>default</quota>
<access_management>1</access_management>
<show_named_collections>1</show_named_collections>
<named_collection_control>1</named_collection_control>
<show_named_collections_secrets>1</show_named_collections_secrets>
</rdsadmin>
</users>
<profiles>
<rdsadmin>
<log_queries>0</log_queries>
<skip_unavailable_shards>1</skip_unavailable_shards>
<http_connection_timeout>10</http_connection_timeout>
</rdsadmin>
</profiles>
</yandex>
Secret存储: chi-ck-ec818311-admin-user-secret
data:
username: cmRzYWRtaW4=
password: eTVDWjhaWVE=
default 用户配置
配置文件: users.d/chop-generated-users.xml
<yandex>
<users>
<default>
<networks>
<ip>::1</ip>
<ip>127.0.0.1</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
</default>
<rdsadmin>
<networks>
<ip>::1</ip>
<ip>127.0.0.1</ip>
</networks>
<password_sha256_hex>f954deb4f118e86dfd9da96e5dfb20296a502db175bf74a0a8fb6c36d85af4b2</password_sha256_hex>
<profile>rdsadmin</profile>
<quota>default</quota>
</rdsadmin>
</users>
</yandex>
2.7 Pod 模板配置 (templates.podTemplates)
spec:
templates:
podTemplates:
- name: pod-template-ck-ec818311-0
podDistribution:
- type: ClickHouseAntiAffinity
spec:
containers:
- name: clickhouse
image: k8smaster.qfusion.irds/irds/clickhouse-server:24.8.7.41
env:
- name: MASTER_HOST
value: ck-ec818311-replica0-0
resources:
limits:
cpu: "1"
memory: 4Gi
requests:
cpu: 500m
memory: 4Gi
ports:
- containerPort: 8123
name: http
- containerPort: 9000
name: tcp
- containerPort: 9010
name: interserver
volumeMounts:
- mountPath: /var/lib/clickhouse/
name: data
- mountPath: /var/log/clickhouse-server
name: logs
- mountPath: /etc/clickhouse-server/config.d/
name: chi-ck-ec818311-common-configd
- mountPath: /etc/clickhouse-server/users.d/
name: chi-ck-ec818311-common-usersd
- mountPath: /etc/clickhouse-server/conf.d/
name: chi-ck-ec818311-deploy-confd-ck-ec818311-0-0
2.8 Service 模板配置
serviceTemplates:
- name: ck-ec818311-replica-svc
spec:
type: ClusterIP
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800
ports:
- name: http
port: 8123
targetPort: 8123
- name: tcp
port: 9000
targetPort: 9000
- name: ssh
port: 22
targetPort: 22
三、参数持久化机制
3.1 ConfigMap 持久化架构
ClickHouse Operator 使用 ConfigMap 作为配置持久化的核心机制。
3.1.1 ConfigMap 列表
| ConfigMap名称 |
用途 |
挂载路径 |
数据项 |
chi-ck-ec818311-common-configd |
通用配置 |
/etc/clickhouse-server/config.d/ |
8项 |
chi-ck-ec818311-common-usersd |
用户配置 |
/etc/clickhouse-server/users.d/ |
4项 |
chi-ck-ec818311-deploy-confd-ck-ec818311-0-0 |
Pod专属配置 |
/etc/clickhouse-server/conf.d/ |
3项 |
chi-ck-ec818311-deploy-confd-ck-ec818311-0-1 |
Pod专属配置 |
/etc/clickhouse-server/conf.d/ |
3项 |
chi-ck-ec818311-backup-config |
备份配置 |
/etc/clickhouse-backup/ |
1项 |
3.1.2 ConfigMap 详细内容
chi-ck-ec818311-common-configd (8项配置文件):
| 文件名 |
内容说明 |
01-clickhouse-01-listen.xml |
监听地址配置 (::, 0.0.0.0) |
01-clickhouse-02-logger.xml |
日志配置 (级别: error, 路径: /var/log/clickhouse-server/) |
01-clickhouse-03-query_log.xml |
查询日志 (system.query_log, TTL 30天) |
01-clickhouse-04-part_log.xml |
分区日志 (system.part_log, TTL 30天) |
02-clickhouse-01-qfusion.xml |
QFusion平台定制配置 |
chop-generated-remote_servers.xml |
集群拓扑配置 |
chop-generated-settings.xml |
基础设置 (端口、时区) |
memory.xml |
内存限制配置 |
chi-ck-ec818311-deploy-confd-ck-ec818311-0-0 (Pod专属配置):
| 文件名 |
内容说明 |
chop-generated-macros.xml |
宏变量定义 (cluster, shard, replica) |
chop-generated-ports.xml |
端口配置 |
chop-generated-zookeeper.xml |
ZooKeeper连接配置 |
3.1.3 ConfigMap 生命周期管理
metadata:
ownerReferences:
- apiVersion: clickhouse.altinity.com/v1
blockOwnerDeletion: true
controller: true
kind: ClickHouseInstallation
name: ck-ec818311
关键特性:
- ConfigMap 通过
ownerReferences 与 CHI 绑定
- 删除 CHI 时,相关 ConfigMap 会自动清理
- Operator 监听 CHI 变化,自动更新 ConfigMap
3.2 配置文件在容器内的挂载结构
/etc/clickhouse-server/
├── config.d/ # 通用配置目录
│ ├── 01-clickhouse-01-listen.xml # 监听配置
│ ├── 01-clickhouse-02-logger.xml # 日志配置
│ ├── 01-clickhouse-03-query_log.xml # 查询日志
│ ├── 01-clickhouse-04-part_log.xml # 分区日志
│ ├── 02-clickhouse-01-qfusion.xml # 平台定制
│ ├── chop-generated-remote_servers.xml # 集群拓扑
│ ├── chop-generated-settings.xml # 基础设置
│ └── memory.xml # 内存限制
│
├── users.d/ # 用户配置目录
│ ├── 01-clickhouse-rdsadmin-profile.xml # 管理员配置
│ ├── 02-clickhouse-default-profile.xml # 默认配置
│ ├── chop-generated-profiles.xml # 性能配置
│ └── chop-generated-users.xml # 用户定义
│
└── conf.d/ # Pod专属配置目录
├── chop-generated-macros.xml # 宏变量
├── chop-generated-ports.xml # 端口配置
└── chop-generated-zookeeper.xml # ZK配置
/etc/clickhouse-backup/
└── config.yaml # 备份配置
/etc/clickhouse-operator/
└── qfusionadmin/ # 管理员凭证目录
3.3 配置更新流程
┌─────────────────┐
│ 用户修改 CHI │
└────────┬────────┘
│
▼
┌─────────────────────────────────┐
│ ClickHouse Operator 监听到变化 │
└────────┬────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ Operator 更新 ConfigMap │
│ (chop-generated-*.xml) │
└────────┬────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ Operator 更新 StatefulSet │
│ (settings-version label变化) │
└────────┬────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ 滚动更新 Pod │
│ (按顺序重启副本) │
└────────┬────────────────────────┘
│
▼
┌─────────────────────────────────┐
│ 新 Pod 加载新配置启动 │
└─────────────────────────────────┘
版本标签机制:
labels:
clickhouse.altinity.com/settings-version: 3aa06eec4c6e577f2f2635e838cfe34e49b66bf0
clickhouse.altinity.com/zookeeper-version: 7292d345b9a7e8016eb5ee3d67a2c6d59ce7e48e
3.4 Secret 持久化 (敏感信息)
| Secret名称 |
用途 |
数据 |
chi-ck-ec818311-admin-user-secret |
管理员凭证 |
username, password |
挂载方式:
volumes:
- name: chi-ck-ec818311-admin-user-secret
secret:
secretName: chi-ck-ec818311-admin-user-secret
volumeMounts:
- mountPath: /etc/clickhouse-operator/qfusionadmin
name: chi-ck-ec818311-admin-user-secret
四、数据持久化机制
4.1 存储架构
┌─────────────────────────────────────────────────────────────┐
│ ClickHouse Pod │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ clickhouse 容器 ││
│ │ ││
│ │ /var/lib/clickhouse/ ◄─────┐ ││
│ │ │ ├── data/ │ ││
│ │ │ ├── metadata/ │ ││
│ │ │ ├── preprocessed_configs/│ ││
│ │ │ └── ... │ ││
│ │ │ │ ││
│ │ /var/log/clickhouse-server/ ───┤ emptyDir (临时) ││
│ │ │ │ ││
│ └───────────────────────────┼────┼──────────────────────────┘│
│ │ │ │
│ ┌───────────────────────────▼────▼──────────────────────────┐│
│ │ Volume Layer ││
│ │ ┌─────────────────┐ ┌─────────────────┐ ││
│ │ │ PVC: data │ │ emptyDir: logs │ ││
│ │ │ 20Gi, RWO │ │ (Pod重启丢失) │ ││
│ │ └────────┬────────┘ └─────────────────┘ ││
│ └───────────┼──────────────────────────────────────────────┘│
│ │ │
└──────────────┼─────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Kubernetes Storage Layer │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ StorageClass: csi-localpv │ │
│ │ Provisioner: qfusion/localpv │ │
│ │ ReclaimPolicy: Delete │ │
│ │ BindingMode: WaitForFirstConsumer │ │
│ └────────────────────────┬────────────────────────────────┘ │
└───────────────────────────┼───────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Node Local Storage │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ /opt/qfusion/localpv/ │ │
│ │ └── data-ck-ec818311-replica0-0-0-qfusion2-9d6025ef/ │ │
│ │ └── pv/ │ │
│ └─────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────┘
4.2 StorageClass 配置
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: csi-localpv
annotations:
sc.woqutech.com/display: "本地存储"
sc.woqutech.com/mount: /opt/qfusion/localpv
sc.woqutech.com/vendor: woqutech
provisioner: qfusion/localpv
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
特性说明:
- 本地存储: 直接使用节点本地磁盘,提供最高性能
- 延迟绑定:
WaitForFirstConsumer 确保Pod调度时考虑存储位置
- 回收策略:
Delete - PVC删除时数据会被清除
- 在线扩容: 支持动态扩容
4.3 PVC 详细配置
| PVC名称 |
容量 |
访问模式 |
存储类 |
绑定节点 |
状态 |
| data-ck-ec818311-replica0-0-0 |
20Gi |
RWO |
csi-localpv |
qfusion2 |
Bound |
| data-ck-ec818311-replica0-1-0 |
20Gi |
RWO |
csi-localpv |
qfusion1 |
Bound |
| data-ck-ec818311-zk-0-0 |
20Gi |
RWO |
csi-localpv |
qfusion2 |
Bound |
| data-ck-ec818311-zk-1-0 |
20Gi |
RWO |
csi-localpv |
qfusion3 |
Bound |
| data-ck-ec818311-zk-2-0 |
20Gi |
RWO |
csi-localpv |
qfusion1 |
Bound |
PVC模板定义:
volumeClaimTemplates:
- metadata:
name: data
labels:
AppName: ck-ec818311
DBType: ck
clickhouse.altinity.com/reclaimPolicy: Delete
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: csi-localpv
4.4 Pod 反亲和性调度
podDistribution:
- type: ClickHouseAntiAffinity
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
clickhouse.altinity.com/app: chop
clickhouse.altinity.com/chi: ck-ec818311
clickhouse.altinity.com/cluster: ck-ec818311
topologyKey: kubernetes.io/hostname
效果: 两个ClickHouse副本分别调度到不同节点 (qfusion2, qfusion1)
4.5 日志持久化策略
| 日志类型 |
存储方式 |
持久化 |
保留策略 |
| ClickHouse错误日志 |
emptyDir |
否 |
Pod重启丢失 |
| ClickHouse查询日志 |
system.query_log表 |
是 (数据目录) |
TTL 30天 |
| ClickHouse分区日志 |
system.part_log表 |
是 (数据目录) |
TTL 30天 |
| 慢查询日志 |
emptyDir |
否 |
由slowlog容器处理 |
五、备份配置
5.1 备份配置 ConfigMap
配置文件: /etc/clickhouse-backup/config.yaml
general:
remote_storage: none
max_file_size: 1073741824
disable_progress_bar: true
backups_to_keep_local: 0
backups_to_keep_remote: 0
log_level: info
allow_empty_backups: false
download_concurrency: 1
upload_concurrency: 1
restore_schema_on_cluster: ""
clickhouse:
username: rdsadmin
password: xxx
host: localhost
port: 9000
disk_mapping: {}
skip_tables:
- system.*
- INFORMATION_SCHEMA.*
- information_schema.*
timeout: 5m
freeze_by_part: false
sync_replicated_tables: true
debug: false
s3:
access_key: ""
secret_key: ""
bucket: ""
endpoint: ""
region: us-east-1
gcs:
credentials_file: ""
bucket: ""
azblob:
account_name: ""
account_key: ""
container: ""
5.2 备份配置挂载
volumes:
- name: chi-ck-ec818311-backup-config
configMap:
name: chi-ck-ec818311-backup-config
volumeMounts:
- mountPath: /etc/clickhouse-backup/
name: chi-ck-ec818311-backup-config
六、健康检查配置
6.1 存活探针 (Liveness Probe)
livenessProbe:
httpGet:
path: /ping
port: http
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 3
timeoutSeconds: 1
successThreshold: 1
failureThreshold: 10
6.2 就绪探针 (Readiness Probe)
readinessProbe:
httpGet:
path: /ping
port: http
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 3
timeoutSeconds: 1
successThreshold: 1
failureThreshold: 3
七、集群内部配置验证
7.1 system.clusters 查询结果
┌─cluster─────┬─shard_num─┬─replica_num─┬─host_num─┬─host──────────────────────┬─port─┬─user────┬─shard─┬─replica───────┐
│ ck-ec818311 │ 1 │ 1 │ 1 │ ck-ec818311-replica0-0 │ 9000 │ rdsadmin│ 0 │ ck-ec818311...│
│ ck-ec818311 │ 1 │ 1 │ 2 │ ck-ec818311-replica0-1 │ 9000 │ rdsadmin│ 0 │ ck-ec818311...│
└─────────────┴───────────┴─────────────┴──────────┴──────────────────────────┴──────┴─────────┴───────┴───────────────┘
说明:
- 1个分片 (shard_num = 1)
- 2个副本 (replica_num = 1, host_num = 1,2)
- 使用
rdsadmin 用户进行副本间认证
八、配置修改操作指南
8.1 修改内存限制
方式1: 通过 CHI spec
kubectl edit chi ck-ec818311 -n qfusion-admin
spec:
configuration:
files:
memory.xml: |
<yandex>
<max_server_memory_usage>8589934592</max_server_memory_usage>
</yandex>
profiles:
default/max_memory_usage: "8589934592"
方式2: 同时修改Pod资源限制
templates:
podTemplates:
- spec:
containers:
- name: clickhouse
resources:
limits:
memory: 8Gi
requests:
memory: 8Gi
8.2 修改副本数
kubectl edit chi ck-ec818311 -n qfusion-admin
spec:
configuration:
clusters:
- layout:
replicasCount: 3
8.3 启用远程备份
编辑 ConfigMap:
kubectl edit cm chi-ck-ec818311-backup-config -n qfusion-admin
data:
config.yaml: |
general:
remote_storage: s3
backups_to_keep_remote: 7
s3:
bucket: my-clickhouse-backups
endpoint: s3.amazonaws.com
region: us-east-1
access_key: <your-access-key>
secret_key: <your-secret-key>
九、最佳实践与建议
9.1 当前环境特点
| 特性 |
当前配置 |
说明 |
| Operator |
Altinity v0.18.5 |
稳定版本 |
| 存储类型 |
Local PV |
高性能,无网络开销 |
| 副本分布 |
跨节点反亲和 |
高可用 |
| 会话保持 |
3小时 |
适合长连接 |
| 内存限制 |
2.8GB/4Gi |
70%利用率 |
9.2 优化建议
1. 备份配置
- 当前问题:
remote_storage: none,无实际备份
- 建议: 配置S3/Azure/GCS远程存储
2. 副本数
- 当前: 2副本
- 建议: 生产环境建议3副本,提供更好的容错能力
3. 监控集成
- 当前: 未发现 Prometheus ServiceMonitor
- 建议: 添加监控告警
4. 日志持久化
- 当前: 日志使用 emptyDir,Pod重启丢失
- 建议: 考虑配置持久化日志用于审计
5. 资源限制
- 当前: CPU 1核,内存 4Gi
- 建议: 根据实际负载调整,可考虑CPU 2核
9.3 配置文件位置速查
| 配置项 |
文件路径 |
持久化方式 |
| 集群拓扑 |
/etc/clickhouse-server/config.d/chop-generated-remote_servers.xml |
ConfigMap |
| 基础设置 |
/etc/clickhouse-server/config.d/chop-generated-settings.xml |
ConfigMap |
| 内存限制 |
/etc/clickhouse-server/config.d/memory.xml |
ConfigMap |
| 用户配置 |
/etc/clickhouse-server/users.d/ |
ConfigMap |
| ZooKeeper |
/etc/clickhouse-server/conf.d/chop-generated-zookeeper.xml |
ConfigMap |
| 宏变量 |
/etc/clickhouse-server/conf.d/chop-generated-macros.xml |
ConfigMap |
| 备份配置 |
/etc/clickhouse-backup/config.yaml |
ConfigMap |
| 管理员凭证 |
/etc/clickhouse-operator/qfusionadmin/ |
Secret |
| 数据目录 |
/var/lib/clickhouse/ |
PVC (Local PV) |
| 日志目录 |
/var/log/clickhouse-server/ |
emptyDir (临时) |
附录: 连接信息
服务端点
| 连接类型 |
地址 |
| HTTP |
http://ck-ec8183110.qfusion-admin.svc.cluster.local:8123 |
| TCP |
ck-ec8183110.qfusion-admin.svc.cluster.local:9000 |
| 副本0 HTTP |
http://ck-ec818311-replica0-0.qfusion-admin.svc.cluster.local:8123 |
| 副本1 HTTP |
http://ck-ec818311-replica0-1.qfusion-admin.svc.cluster.local:8123 |
管理员凭证
用户名: rdsadmin
密码: xxx
连接示例
curl 'http://ck-ec8183110.qfusion-admin.svc.cluster.local:8123/?query=SELECT%20version()'
kubectl exec -n qfusion-admin ck-ec818311-replica0-0-0 -c clickhouse -- \
clickhouse-client --host=127.0.0.1 --user=rdsadmin --password=xxx
ClickHouse merge_tree_settings 配置分析
环境信息
| 项目 |
值 |
| 实例名称 |
ck-ec818311 |
| ClickHouse版本 |
24.8.7.41 |
| 分析时间 |
2026-01-27 |
一、merge_tree_settings 概述
merge_tree_settings 是 ClickHouse MergeTree 系列表引擎的表级配置参数。这些参数可以在建表时通过 SETTINGS 子句指定,用于控制表的数据合并、分区管理、复制行为等。
配置层级
全局配置 (config.xml)
↓
CHI configuration (defaults)
↓
表级 SETTINGS 子句 (优先级最高)
二、当前环境的 merge_tree_settings 配置
2.1 QFusion 平台全局配置
在 /etc/clickhouse-server/config.d/02-clickhouse-01-qfusion.xml 中定义了以下 MergeTree 相关的全局设置:
<yandex>
<merge_tree_metadata_cache>
<lru_cache_size>268435456</lru_cache_size>
<continue_if_corrupted>true</continue_if_corrupted>
</merge_tree_metadata_cache>
<max_table_size_to_drop>0</max_table_size_to_drop>
<max_partition_size_to_drop>0</max_partition_size_to_drop>
<max_suspicious_broken_parts>50</max_suspicious_broken_parts>
<default_replica_path>/clickhouse/tables/{database}/{table}/{shard}</default_replica_path>
<default_replica_name>{replica}</default_replica_name>
</yandex>
2.2 参数说明
| 参数 |
当前值 |
说明 |
merge_tree_metadata_cache.lru_cache_size |
268435456 (256MB) |
元数据缓存大小 |
merge_tree_metadata_cache.continue_if_corrupted |
true |
元数据损坏时继续加载 |
max_table_size_to_drop |
0 |
删除表的大小限制(0=无限制) |
max_partition_size_to_drop |
0 |
删除分区的大小限制(0=无限制) |
max_suspicious_broken_parts |
50 |
最大可疑损坏分区数 |
三、system.merge_tree_settings 完整��数说明
以下列出当前环境 ClickHouse 24.8.7.41 版本支持的所有 merge_tree_settings 参数及其默认值:
3.1 索引和粒度相关
| 参数 |
默认值 |
说明 |
index_granularity |
8192 |
每个索引标记对应的行数 |
index_granularity_bytes |
10485760 (10MB) |
单个粒度的近似字节数(0=禁用) |
min_index_granularity_bytes |
1024 |
单个粒度的最小字节数 |
enable_mixed_granularity_parts |
true |
启用自适应和非自适应粒度的混合 |
merge_tree_coarse_index_granularity |
8 |
粗粒度索引值 |
3.2 数据合并 (Merge) 相关
| 参数 |
默认值 |
说明 |
max_bytes_to_merge_at_max_space_in_pool |
161061273600 (150GB) |
线程池空闲时的最大合并字节数 |
max_bytes_to_merge_at_min_space_in_pool |
1048576 (1MB) |
线程池忙碌时的最大合并字节数 |
max_parts_to_merge_at_once |
100 |
一次最多合并的分区数 |
merge_max_block_size |
8192 |
合并操作的块大小 |
merge_max_block_size_bytes |
10485760 (10MB) |
合并操作的块字节数 |
merge_selecting_sleep_ms |
5000 |
合并选择任务的休眠时间 |
merge_selecting_sleep_slowdown_factor |
1.2 |
合并选择休眠时间调整因子 |
max_merge_selecting_sleep_ms |
60000 |
合并选择的最大休眠时间 |
number_of_free_entries_in_pool_to_lower_max_size_of_merge |
8 |
降低合并大小的空闲线程池阈值 |
background_task_preferred_step_execution_time_ms |
50 |
后台任务期望的执行时间 |
3.3 副本数控制 (防止 “Too many parts”)
| 参数 |
默认值 |
说明 |
parts_to_delay_insert |
1000 |
触发INSERT延迟的分区数阈值 |
parts_to_throw_insert |
3000 |
触发INSERT异常的分区数阈值 |
max_parts_in_total |
100000 |
总分区数上限 |
inactive_parts_to_delay_insert |
0 |
触发延迟的非活跃分区数 |
inactive_parts_to_throw_insert |
0 |
触发异常的非活跃分区数 |
max_avg_part_size_for_too_many_parts |
1073741824 (1GB) |
"太多分区"检查的平均分区大小阈值 |
3.4 TTL 合并相关
| 参数 |
默认值 |
说明 |
merge_with_ttl_timeout |
14400 (4小时) |
TTL合并的最小间隔时间 |
max_number_of_merges_with_ttl_in_pool |
2 |
线程池中允许的TTL合并任务数上限 |
3.5 变更 (Mutation) 相关
| 参数 |
默认值 |
说明 |
max_number_of_mutations_for_replica |
0 |
每个副本的变更任务数限制(0=无限制) |
max_delay_to_mutate_ms |
1000 |
变更的最大延迟时间 |
min_delay_to_mutate_ms |
10 |
变更的最小延迟时间 |
number_of_mutations_to_delay |
500 |
触发变更延迟的积压任务数 |
number_of_mutations_to_throw |
1000 |
触发变更异常的积压任务数 |
number_of_free_entries_in_pool_to_execute_mutation |
20 |
执行变更的空闲线程池阈值 |
max_postpone_time_for_failed_mutations_ms |
300000 (5分钟) |
失败变更的最大推迟时间 |
3.6 复制 (Replication) 相关
| 参数 |
默认值 |
说明 |
replicated_can_become_leader |
true |
是否允许成为leader副本 |
replicated_deduplication_window |
1000 |
ZooKeeper中保留的去重哈希块数量 |
replicated_deduplication_window_seconds |
604800 (7天) |
基于时间的去重窗口 |
max_replicated_merges_in_queue |
1000 |
复制队列中的合并任务数上限 |
max_replicated_mutations_in_queue |
8 |
复制队列中的变更任务数上限 |
max_replicated_merges_with_ttl_in_queue |
1 |
复制队列中的TTL合并任务数上限 |
max_replicated_logs_to_keep |
1000 |
非活跃副本保留的日志条目数 |
min_replicated_logs_to_keep |
10 |
ZooKeeper中最少保留的日志条目数 |
replicated_max_ratio_of_wrong_parts |
0.5 |
允许启动的错误分区比例阈值 |
3.7 数据格式相关
| 参数 |
默认值 |
说明 |
min_bytes_for_wide_part |
10485760 (10MB) |
创建wide格式分区的最小字节数 |
min_rows_for_wide_part |
0 |
创建wide格式分区的最小行数 |
compress_marks |
true |
压缩标记文件 |
compress_primary_key |
true |
压缩主键文件 |
marks_compress_block_size |
65536 |
标记压缩块大小 |
marks_compression_codec |
ZSTD(3) |
标记压缩算法 |
primary_key_compress_block_size |
65536 |
主键压缩块大小 |
primary_key_compression_codec |
ZSTD(3) |
主键压缩算法 |
primary_key_lazy_load |
true |
延迟加载主键 |
3.8 垂直合并相关
| 参数 |
默认值 |
说明 |
enable_vertical_merge_algorithm |
true |
启用垂直合并算法 |
vertical_merge_algorithm_min_bytes_to_activate |
0 |
激活垂直合并的最小字节数 |
vertical_merge_algorithm_min_rows_to_activate |
131072 |
激活垂直合并的最小行数 |
vertical_merge_algorithm_min_columns_to_activate |
11 |
激活垂直合并的最小列数 |
allow_vertical_merges_from_compact_to_wide_parts |
true |
允许从compact到wide的垂直合并 |
3.9 存储策略相关
| 参数 |
默认值 |
说明 |
storage_policy |
default |
存储策略名称 |
disk |
- |
存储磁盘名称(可替代storage_policy) |
3.10 清理和维护相关
| 参数 |
默认值 |
说明 |
old_parts_lifetime |
480 (8分钟) |
保留旧分区的时间 |
temporary_directories_lifetime |
86400 (24小时) |
保留临时目录的时间 |
cleanup_delay_period |
30 |
清理任务的最小间隔时间 |
max_cleanup_delay_period |
300 (5分钟) |
清理任务的最大间隔时间 |
merge_tree_clear_old_parts_interval_seconds |
1 |
清理旧分区的执行间隔 |
merge_tree_clear_old_temporary_directories_interval_seconds |
60 |
清理临时目录的执行间隔 |
concurrent_part_removal_threshold |
100 |
激活并发分区移除的阈值 |
3.11 ZooKeeper 相关
| 参数 |
默认值 |
说明 |
use_minimalistic_checksums_in_zookeeper |
true |
在ZK中使用最小校验和格式 |
use_minimalistic_part_header_in_zookeeper |
true |
在ZK中使用最小分区头格式 |
zookeeper_session_expiration_check_period |
60 |
ZK会话过期检查周期 |
3.12 实验性功能
| 参数 |
默认值 |
说明 |
allow_experimental_shared_merge_tree |
false |
允许实验性共享MergeTree |
allow_experimental_replacing_merge_with_cleanup |
false |
允许实验性CLEANUP合并 |
四、表级配置示例
4.1 创建表时指定 merge_tree_settings
CREATE TABLE my_table
(
id UInt64,
event_time DateTime,
data String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (id, event_time)
SETTINGS
max_bytes_to_merge_at_max_space_in_pool = 107374182400,
max_parts_to_merge_at_once = 50,
parts_to_delay_insert = 500,
parts_to_throw_insert = 1500,
merge_with_ttl_timeout = 7200,
min_bytes_for_wide_part = 20971520,
storage_policy = 'default';
4.2 ReplicatedMergeTree 表配置示例
CREATE TABLE replicated_table
(
date Date,
id UInt32,
value UInt64
)
ENGINE = ReplicatedMergeTree(
'/clickhouse/tables/{shard}/replicated_table',
'{replica}'
)
PARTITION BY toYYYYMM(date)
ORDER BY (date, id)
SETTINGS
replicated_deduplication_window = 2000,
replicated_max_ratio_of_wrong_parts = 0.3,
max_replicated_merges_in_queue = 500,
max_replicated_mutations_in_queue = 16,
inactive_parts_to_delay_insert = 100,
inactive_parts_to_throw_insert = 500;
4.3 修改现有表的设置
ALTER TABLE my_table
MODIFY SETTING max_bytes_to_merge_at_max_space_in_pool = 214748364800;
五、配置建议
5.1 高吞吐写入场景
SETTINGS
parts_to_delay_insert = 3000,
parts_to_throw_insert = 10000,
max_parts_to_merge_at_once = 200;
5.2 大数据量查询场景
SETTINGS
min_bytes_for_wide_part = 52428800,
index_granularity = 16384,
index_granularity_bytes = 20971520;
5.3 频繁更新/删除场景
SETTINGS
max_number_of_mutations_for_replica = 100,
number_of_mutations_to_delay = 1000,
number_of_mutations_to_throw = 2000;
5.4 低存储成本场景
SETTINGS
marks_compression_codec = 'ZSTD(5)',
primary_key_compression_codec = 'ZSTD(5)',
min_bytes_for_wide_part = 5242880;
六、当前环境配置评估
6.1 优点
- 合理的元数据缓存: 256MB的元数据缓存适合中小规模表
- 删除保护已禁用: max_table_size_to_drop=0 允许删除任意大小的表
- 损坏容错: continue_if_corrupted=true 在元数据损坏时仍能启动
6.2 可优化项
- 未配置表级 merge_tree_settings: 建议根据业务特点在表级别配置
- 副本数阈值可能较低: parts_to_delay_insert=1000 对于高并发写入可能触发过早
- 未启用实验性功能: 根据需求考虑启用 allow_experimental_shared_merge_tree
6.3 配置对比
| 配置项 |
当前全局默认 |
高吞吐推荐 |
大查询推荐 |
| parts_to_delay_insert |
1000 |
3000-5000 |
1000 |
| parts_to_throw_insert |
3000 |
10000-15000 |
3000 |
| max_parts_to_merge_at_once |
100 |
200-300 |
50 |
| min_bytes_for_wide_part |
10MB |
10MB |
50MB |
| index_granularity |
8192 |
8192 |
16384 |
七、常见问题排查
7.1 “Too many parts” 错误
SELECT
database,
table,
partition,
count() AS parts_count,
sum(rows) AS total_rows,
formatReadableSize(sum(bytes_on_disk)) AS total_size
FROM system.parts
WHERE active AND database != 'system'
GROUP BY database, table, partition
ORDER BY parts_count DESC
LIMIT 10;
ALTER TABLE my_table MODIFY SETTING
parts_to_delay_insert = 2000,
max_parts_to_merge_at_once = 150;
7.2 合并任务积压
SELECT * FROM system.replication_queue
WHERE type = 'MERGE_PART'
ORDER BY create_time DESC
LIMIT 10;
7.3 TTL 合并不及时
SELECT
database,
table,
name,
ttl_expression,
max_ttl_part
FROM system.ttls
ORDER BY database, table;
ALTER TABLE my_table MODIFY SETTING
merge_with_ttl_timeout = 3600;
八、参考资料
所有评论(0)