Cloud-Native PostgreSQL: Deploying a Primary-Replica Cluster on Kubernetes with Patroni

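Core Concepts
  1. What Patroni does
    Patroni manages a highly available PostgreSQL cluster automatically: it handles failover, keeps configuration in sync across members, and monitors node health.
  2. Kubernetes integration
    A StatefulSet manages the stateful Pods, and a Headless Service gives each member a stable DNS name for discovery.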

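Deployment Steps
1. Prepare the environment
  • StorageClass
    Make sure a StorageClass that supports dynamic volume provisioning is available (for example gp2 or csi-rbd).
  • etcd cluster
    In this setup Patroni stores cluster state in etcd, which must be deployed separately.
2. Create the configuration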

ConfigMap (patroni.yaml)

apiVersion: v1
kind: ConfigMap
metadata:
  name: patroni-config
data:
  patroni.yml: |
    scope: pg-cluster
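    # Placeholder only: $(...) is not expanded inside ConfigMap data, so the real
    # value has to come from the PATRONI_NAME environment variable.
    name: '$(HOSTNAME)'
    restapi:
      listen: 0.0.0.0:8008
      # Placeholder; override with PATRONI_RESTAPI_CONNECT_ADDRESS.
      connect_address: '$(POD_IP):8008'
    # The etcd key uses the etcd v2 API; use etcd3 instead if the cluster only serves v3.
    etcd:
      hosts: "etcd-0.etcd:2379,etcd-1.etcd:2379,etcd-2.etcd:2379"
    bootstrap:
      dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        postgresql:
          use_pg_rewind: true
          parameters:
            wal_level: logical
            max_connections: 200
    postgresql:
      listen: 0.0.0.0:5432
      # Placeholder; override with PATRONI_POSTGRESQL_CONNECT_ADDRESS.
      connect_address: '$(POD_IP):5432'
      data_dir: /var/lib/postgresql/data
      # Replicas need a pg_hba rule to connect for replication; narrow the CIDRs for production.
      pg_hba:
        - host replication replicator 0.0.0.0/0 md5
        - host all all 0.0.0.0/0 md5
      authentication:
        replication:
          username: replicator
          # Placeholder string, not read from the Secret; override with PATRONI_REPLICATION_PASSWORD.
          password: "REPLICATOR_PASS"
        superuser:
          username: postgres
          # Placeholder string; override with PATRONI_SUPERUSER_PASSWORD.
          password: "ADMIN_PASS"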
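
The three timing values under bootstrap.dcs are related: Patroni's documentation asks that loop_wait + 2 * retry_timeout stay at or below ttl. The shell sketch below only checks that arithmetic for the values used here. The function name check_patroni_timing is made up for illustration and is not part of Patroni.

```shell
# check_patroni_timing TTL LOOP_WAIT RETRY_TIMEOUT
# Succeeds when loop_wait + 2 * retry_timeout <= ttl (hypothetical helper).
check_patroni_timing() {
  ttl=$1
  loop_wait=$2
  retry_timeout=$3
  required=$((loop_wait + 2 * retry_timeout))
  if [ "$required" -le "$ttl" ]; then
    echo "ok: loop_wait + 2*retry_timeout = $required <= ttl = $ttl"
  else
    echo "too tight: loop_wait + 2*retry_timeout = $required > ttl = $ttl" >&2
    return 1
  fi
}

# Values from the ConfigMap: ttl 30, loop_wait 10, retry_timeout 10.
check_patroni_timing 30 10 10
```

With these values the sum is exactly 30, so the configuration sits right at the limit. Raising loop_wait or retry_timeout without also raising ttl would break the rule.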

3. Create the Secret
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secrets
stringData:
  REPLICATOR_PASS: "securepassword123"
  ADMIN_PASS: "adminsecure456"
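
Rather than committing literal passwords to a manifest, the two values can be generated at deploy time. The sketch below is one possible approach, not the only one. gen_password is a hypothetical helper, and it assumes head, base64, tr and cut are available on the machine that runs it.

```shell
# gen_password LENGTH
# Prints LENGTH random alphanumeric characters (intended for LENGTH <= 64).
gen_password() {
  head -c 96 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | cut -c1-"$1"
}

REPLICATOR_PASS=$(gen_password 24)
ADMIN_PASS=$(gen_password 24)

# To create the Secret from these values instead of a YAML file:
#   kubectl create secret generic postgres-secrets \
#     --from-literal=REPLICATOR_PASS="$REPLICATOR_PASS" \
#     --from-literal=ADMIN_PASS="$ADMIN_PASS"
```

The key names match the Secret manifest above, so anything that reads postgres-secrets keeps working.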

4. Deploy the StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-patroni
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres-patroni
  template:
    metadata:
      labels:
        app: postgres-patroni
    spec:
      containers:
      - name: postgres
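        image: postgres:14  # The official image does not ship Patroni; substitute an image that has Patroni installed
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        # Patroni reads PATRONI_* variables and lets them override patroni.yml,
        # which is how per-Pod values and Secret-backed passwords reach it.
        - name: PATRONI_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: PATRONI_RESTAPI_CONNECT_ADDRESS
          value: "$(POD_IP):8008"
        - name: PATRONI_POSTGRESQL_CONNECT_ADDRESS
          value: "$(POD_IP):5432"
        - name: PATRONI_REPLICATION_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: REPLICATOR_PASS
        - name: PATRONI_SUPERUSER_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: ADMIN_PASS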
        ports:
        - containerPort: 5432
          name: postgresql
        - containerPort: 8008
          name: patroni-api
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        - name: config
          mountPath: /etc/patroni
        command: ["patroni"]
        args: ["/etc/patroni/patroni.yml"]
      volumes:
      - name: config
        configMap:
          name: patroni-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "gp2"
      resources:
        requests:
          storage: 10Gi

5. Configure the Services

Headless Service (node discovery)

apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None
  ports:
  - port: 5432
    name: postgresql
  selector:
    app: postgres-patroni

Read-write Service (points at the primary)

apiVersion: v1
kind: Service
metadata:
  name: postgres-master
spec:
  ports:
  - port: 5432
    name: postgresql
  selector:
    app: postgres-patroni
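    # Patroni maintains this label only when its Kubernetes integration is managing
    # Pod labels; with etcd as the DCS it has to be set another way (for example a
    # callback script). Recent Patroni releases may use "primary" as the value.
    role: master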


Verifying the Deployment
  1. Check cluster status

    kubectl exec postgres-patroni-0 -- patronictl -c /etc/patroni/patroni.yml list
    

    Sample output:

    + Cluster: pg-cluster (7181021020637368913) --------+----+-----------+
    | Member             | Host     | Role    | State   | TL | Lag in MB |
    +--------------------+----------+---------+---------+----+-----------+
    | postgres-patroni-0 | 10.1.0.5 | Leader  | running | 1  |           |
    | postgres-patroni-1 | 10.1.0.6 | Replica | running | 1  |         0 |
    | postgres-patroni-2 | 10.1.0.7 | Replica | running | 1  |         0 |
    +--------------------+----------+---------+---------+----+-----------+
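
When scripting around this check, the current leader can be pulled out of the table with awk. This is only a sketch that assumes the column layout in the sample output (Member first, Role third). leader_from_table is an illustrative name, not a Patroni command.

```shell
# leader_from_table
# Reads `patronictl list` table output on stdin and prints the member
# whose Role column is "Leader" (hypothetical helper).
leader_from_table() {
  awk -F'|' '
    {
      role = $4
      gsub(/^[ \t]+|[ \t]+$/, "", role)   # trim padding around the Role cell
      if (role == "Leader") {
        name = $2
        gsub(/^[ \t]+|[ \t]+$/, "", name) # trim padding around the Member cell
        print name
      }
    }'
}

# Fed with rows in the sample layout, this prints postgres-patroni-0.
leader_from_table <<'EOF'
| Member             | Host     | Role    | State   | TL | Lag in MB |
| postgres-patroni-0 | 10.1.0.5 | Leader  | running | 1  |           |
| postgres-patroni-1 | 10.1.0.6 | Replica | running | 1  |         0 |
EOF
```

Against a live cluster the same function can read from the command itself, for example: kubectl exec postgres-patroni-0 -- patronictl -c /etc/patroni/patroni.yml list | leader_from_table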
    

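  2. Failover test
    Delete the primary's Pod:

    kubectl delete pod postgres-patroni-0
    

    Patroni should promote one of the replicas automatically, typically within about 30 seconds (the ttl configured above).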
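
To put a number on the failover time, a small polling loop is enough. The helper below is a sketch, not a Patroni tool. wait_for_new_leader is a hypothetical name, and the command that reports the current leader is supplied by the caller.

```shell
# wait_for_new_leader OLD_LEADER TIMEOUT_SECONDS COMMAND...
# Runs COMMAND once per second. COMMAND must print the current leader's name
# (or nothing while no leader exists). Prints the new leader and returns 0 as
# soon as it differs from OLD_LEADER; returns 1 if TIMEOUT_SECONDS elapse first.
wait_for_new_leader() {
  old_leader=$1
  timeout=$2
  shift 2
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    current=$("$@" 2>/dev/null || true)
    if [ -n "$current" ] && [ "$current" != "$old_leader" ]; then
      echo "$current"
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example call, assuming the role=master label is being maintained on the Pods:
#   time wait_for_new_leader postgres-patroni-0 60 \
#     kubectl get pods -l app=postgres-patroni,role=master \
#       -o jsonpath='{.items[0].metadata.name}'
```

Run the helper immediately after deleting the Pod. The elapsed time it reports is only as precise as the one-second polling interval.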


Key Optimizations
  1. Anti-affinity (keep members on separate nodes)
    Add the following to the StatefulSet Pod spec:

    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: postgres-patroni
          topologyKey: kubernetes.io/hostname
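
One way to confirm the rule took effect is to look for nodes that host more than one member. The sketch below works on plain "POD NODE" pairs. colocated_nodes is a hypothetical helper, and the node names in the example are invented.

```shell
# colocated_nodes
# Reads "POD NODE" pairs on stdin and prints every node that hosts
# more than one Pod, one node per line.
colocated_nodes() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (node in count) if (count[node] > 1) print node }' | sort
}

# Example input: two Pods share node-a, so node-a is reported.
printf '%s\n' \
  'postgres-patroni-0 node-a' \
  'postgres-patroni-1 node-b' \
  'postgres-patroni-2 node-a' | colocated_nodes

# Against a live cluster:
#   kubectl get pods -l app=postgres-patroni --no-headers \
#     -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName | colocated_nodes
```

Empty output means every member runs on its own node. Because the rule is a required one, a Pod that cannot be placed on a free node stays Pending rather than being co-located.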
    

  2. Resource requests and limits

    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
      limits:
        memory: "4Gi"
        cpu: "2"
    

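  3. Monitoring
    Expose metrics to Prometheus by running postgres_exporter (for example as a sidecar container) and declaring its port:

    - containerPort: 9187  # postgres_exporter port
      name: metrics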
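
A quick smoke test for the exporter is to read its pg_up gauge, which postgres_exporter sets to 1 when it can reach the database. The sketch below parses Prometheus text-format output from stdin. pg_up_value is a hypothetical helper, and the sample metrics text is abbreviated.

```shell
# pg_up_value
# Reads Prometheus text-format metrics on stdin and prints the value of pg_up.
# Matches both "pg_up 1" and a labelled form such as pg_up{...} 1.
pg_up_value() {
  awk '$1 ~ /^pg_up([{]|$)/ { print $NF }'
}

# Example with a trimmed-down metrics payload; prints 1.
printf '%s\n' \
  '# TYPE pg_up gauge' \
  'pg_up 1' \
  'pg_exporter_scrapes_total 42' | pg_up_value

# Against a live Pod, after `kubectl port-forward pod/postgres-patroni-0 9187:9187`:
#   curl -s http://localhost:9187/metrics | pg_up_value
```

A value of 0, or no output at all, suggests the exporter is not connected to PostgreSQL or is not running in the Pod.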
    


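Troubleshooting
  • Startup hangs: check connectivity to etcd and make sure the hosts list in the etcd section of patroni.yml is correct.
  • Replicas not syncing: verify that the replicator password Patroni uses matches the one stored in the Secret.
  • Volume mount failures: confirm that the StorageClass name exists in the cluster.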
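
For the first item, each etcd endpoint can be probed in turn. The helper below is a sketch: unreachable_endpoints is a hypothetical name, and the probe command is supplied by the caller, so nothing here depends on a particular tool being installed.

```shell
# unreachable_endpoints "host1:port1,host2:port2" PROBE...
# Calls PROBE HOST PORT for every endpoint in the comma-separated list
# and prints the endpoints for which PROBE fails.
unreachable_endpoints() {
  endpoints=$1
  shift
  printf '%s\n' "$endpoints" | tr ',' '\n' | while IFS= read -r ep; do
    [ -n "$ep" ] || continue
    host=${ep%:*}    # text before the last colon
    port=${ep##*:}   # text after the last colon
    if ! "$@" "$host" "$port" </dev/null >/dev/null 2>&1; then
      echo "$ep"
    fi
  done
}

# Example call, assuming nc is available inside the Pod's image:
#   unreachable_endpoints "etcd-0.etcd:2379,etcd-1.etcd:2379,etcd-2.etcd:2379" \
#     kubectl exec postgres-patroni-0 -- nc -z -w 2
```

Empty output means every endpoint answered the probe. Any endpoint that is printed is worth checking for DNS, Service and network policy problems.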

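Note: production deployments should also enable SSL/TLS encryption, take regular backups (for example with pgBackRest), and isolate traffic with network policies.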
