Horizontally Scaling a Model Service from a Single Machine to Kubernetes: Highly Available Deployment of a Sales Forecasting Service on K8s

This article is the follow-up to "Automated Deployment and Monitoring of a Sales Forecasting Model with Docker + GitHub Actions". It covers upgrading a single-machine Docker deployment to a Kubernetes cluster deployment with autoscaling and high availability, and includes complete Kubernetes manifests, a Helm Chart template, and production-grade best practices.

I. Why Kubernetes?

The evolution path from Docker Compose to Kubernetes

In the previous article we used Docker Compose to run the API + Prometheus + Grafana stack on a single machine. As the business grows, however, you will hit the following bottlenecks:

| Concern | Docker Compose | Kubernetes |
| --- | --- | --- |
| Scaling | Edit the replica count by hand | Automatic scaling via HorizontalPodAutoscaler |
| Rolling updates | Manual docker-compose down/up | Zero-downtime Rolling Update |
| Load balancing | Relies on an external Nginx | Built-in Service load balancing |
| Failure recovery | Needs monitoring scripts | Automatic restart and rescheduling |
| Multi-host deployment | Needs Docker Swarm | Native multi-node clusters |
| Configuration management | Environment variables scattered around | Centralized ConfigMap + Secret |

Kubernetes (K8s for short) is the container orchestration platform open-sourced by Google. It automates deployment, scaling, load balancing, and failure recovery, giving your service enterprise-grade stability and elasticity.

II. A Quick Tour of Core Kubernetes Concepts

Before diving into practice, let's go over a few core concepts (a minimal Pod manifest follows the list):

  • Pod: the smallest schedulable unit in K8s; a Pod contains one or more containers (usually one)
  • Deployment: an abstraction that manages Pod replica counts, rolling updates, and rollbacks
  • Service: a stable access endpoint for a set of Pods, with load balancing
  • Ingress: manages external HTTP/HTTPS access
  • ConfigMap/Secret: stores configuration and sensitive data
  • HorizontalPodAutoscaler (HPA): automatically adjusts the Pod replica count based on CPU/memory
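
To make these concepts concrete, here is a minimal sketch of a bare Pod manifest (the image name is the placeholder used throughout this article). In practice you almost never create Pods directly; you let a Deployment manage them, as in the manifests later on:

# pod-minimal.yaml: illustrative only; normally a Deployment creates Pods for you
apiVersion: v1
kind: Pod
metadata:
  name: retail-forecast-demo
spec:
  containers:
    - name: api
      image: ghcr.io/yourusername/retail-forecast:latest
      ports:
        - containerPort: 5000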

III. Step 1: Refactor the Application for Cloud-Native Operation

To run well on K8s, the application needs the following changes:

1. Graceful shutdown support

# app.py: add signal handling
import signal
import sys

def graceful_shutdown(signum, frame):
    print("收到终止信号,开始优雅关闭...")
    # 清理资源,如关闭数据库连接
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)

2. Configurable health-check endpoints

import os

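# `app` (the Flask instance), `jsonify`, and `model` come from the app.py built in the previous article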
liveness_path = os.environ.get('HEALTH_CHECK_PATH', '/health')
readiness_path = os.environ.get('READINESS_CHECK_PATH', '/ready')

@app.route(liveness_path, methods=['GET'])
def liveness():
    """存活探针:应用是否存活"""
    return jsonify({'status': 'ok'})

@app.route(readiness_path, methods=['GET'])
def readiness():
    """就绪探针:应用是否可以接收流量"""
    if model is None:
        return jsonify({'status': 'not ready', 'reason': 'model not loaded'}), 503
    return jsonify({'status': 'ready'})
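
A quick local sanity check of both endpoints (assuming the app listens on port 5000, as in the previous article):

# Start the app locally, then:
curl -s localhost:5000/health   # expect {"status": "ok"}
curl -s localhost:5000/ready    # returns 503 until the model has loaded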

IV. Step 2: Write the Kubernetes Manifests

Directory layout

k8s/
├── namespace.yaml      # Namespace
├── configmap.yaml      # Configuration
├── deployment.yaml     # Deployment
├── service.yaml        # Service
├── ingress.yaml        # Ingress (optional)
├── hpa.yaml            # Autoscaling
└── secret.yaml         # Secrets (use an external secret store in production)

1. Create the namespace (namespace.yaml)

apiVersion: v1
kind: Namespace
metadata:
  name: retail-forecast
  labels:
    app: retail-forecast
    environment: production

2. Configuration management (configmap.yaml)

apiVersion: v1
kind: ConfigMap
metadata:
  name: retail-forecast-config
  namespace: retail-forecast
data:
  HEALTH_CHECK_PATH: "/health"
  READINESS_CHECK_PATH: "/ready"
  LOG_LEVEL: "info"
  MODEL_CACHE_TTL: "3600"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: retail-forecast
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'retail-forecast-api'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true

3. The Deployment (deployment.yaml)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: retail-forecast-api
  namespace: retail-forecast
  labels:
    app: retail-forecast
    tier: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: retail-forecast
  # Rolling update strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # at most 1 Pod above the desired replica count
      maxUnavailable: 0     # keep zero Pods unavailable during the rollout
  
  # Pod template
  template:
    metadata:
      labels:
        app: retail-forecast
        tier: api
      annotations:
        prometheus.io/scrape: "true"  # enables Prometheus auto-discovery
        prometheus.io/port: "5000"
        prometheus.io/path: "/metrics"
    spec:
      # Graceful termination window
      terminationGracePeriodSeconds: 30
      
      containers:
        - name: api
          image: ghcr.io/yourusername/retail-forecast:latest
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 5000
              protocol: TCP
          
          # Resource requests and limits (prevent one Pod from exhausting cluster resources)
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"
          
          # Environment variables
          envFrom:
            - configMapRef:
                name: retail-forecast-config
          
          # Health checks
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 10
            periodSeconds: 15
            timeoutSeconds: 5
            failureThreshold: 3
          
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          
          # Startup probe (waits for the model to load on cold start)
          startupProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 30  # wait at most 30 * 5 = 150 seconds for startup
          
          # Lifecycle hooks
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]  # 等待kube-proxy更新

4. The Service (service.yaml)

apiVersion: v1
kind: Service
metadata:
  name: retail-forecast-api
  namespace: retail-forecast
  labels:
    app: retail-forecast
spec:
  type: ClusterIP  # cluster-internal access only
  ports:
    - name: http
      port: 80          # Service port
      targetPort: 5000  # Pod port
      protocol: TCP
  selector:
    app: retail-forecast
---
# NodePort Service (for development and testing)
apiVersion: v1
kind: Service
metadata:
  name: retail-forecast-api-nodeport
  namespace: retail-forecast
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 5000
      nodePort: 30080  # fixed NodePort
  selector:
    app: retail-forecast
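
To reach the ClusterIP Service from your workstation without exposing a NodePort, kubectl port-forward works well (a sketch; adjust names to your cluster):

# Terminal 1: forward local port 8080 to the Service
kubectl port-forward svc/retail-forecast-api 8080:80 -n retail-forecast

# Terminal 2: hit the liveness endpoint through the tunnel
curl -s localhost:8080/health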

5. The Ingress (ingress.yaml)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: retail-forecast-ingress
  namespace: retail-forecast
  annotations:
    # Nginx Ingress Controller settings
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/limit-rps: "100"  # rate limit: 100 requests per second
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.forecast.example.com
      secretName: forecast-tls-secret
  rules:
    - host: api.forecast.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: retail-forecast-api
                port:
                  number: 80
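
Before DNS for api.forecast.example.com exists, you can still verify routing end to end with curl's --resolve flag; the IP below is a placeholder for your ingress controller's external address:

# Test TLS termination and routing without real DNS (203.0.113.10 is a placeholder)
curl -sk --resolve api.forecast.example.com:443:203.0.113.10 \
  https://api.forecast.example.com/health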

6. Autoscaling (hpa.yaml)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: retail-forecast-hpa
  namespace: retail-forecast
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: retail-forecast-api
  # Replica count range
  minReplicas: 2
  maxReplicas: 10
  # Scaling metrics
  metrics:
    # CPU utilization (target 70%)
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Memory utilization (target 80%)
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  # Stabilization windows (avoid scaling flapping)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 minutes before scaling down
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0    # scale up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
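
To watch the HPA react, generate sustained load and observe the replica count. The sketch below uses hey as the load generator and the NodePort Service from earlier; the /predict path and payload are assumptions carried over from the previous article's API:

# Terminal 1: watch the HPA scale
kubectl get hpa retail-forecast-hpa -n retail-forecast -w

# Terminal 2: 2 minutes of load at 50 concurrent connections (hey: https://github.com/rakyll/hey)
hey -z 2m -c 50 -m POST -T "application/json" \
  -d '{"store_id": 1, "date": "2024-06-01"}' \
  http://<node-ip>:30080/predict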

V. Step 3: Simplify Deployment with Helm

Helm is the package manager for Kubernetes, analogous to apt/yum; it lets you deploy a complex application with one command.

1. Create the Helm Chart scaffold

helm create retail-forecast

The generated directory layout:

retail-forecast/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── hpa.yaml
│   └── _helpers.tpl
└── .helmignore

2. Tune values.yaml

# values.yaml

replicaCount: 3

image:
  repository: ghcr.io/yourusername/retail-forecast
  pullPolicy: IfNotPresent
  tag: "latest"

imagePullSecrets: []
# - name: ghcr-secret

nameOverride: ""
fullnameOverride: ""

serviceAccount:
  create: true
  annotations: {}
  name: ""

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "5000"
  prometheus.io/path: "/metrics"

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  hosts:
    - host: api.forecast.example.com
      paths:
        - path: /
          pathType: Prefix
          service:
            port: 80
  tls:
    - secretName: forecast-tls-secret
      hosts:
        - api.forecast.example.com

resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

nodeSelector: {}

tolerations: []

affinity: {}

# Monitoring components
monitoring:
  enabled: true
  prometheus:
    enabled: true
    namespace: monitoring
  grafana:
    enabled: true
    namespace: monitoring

3. One-command deployment

# Add the monitoring chart repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Deploy the application
helm install retail-forecast ./retail-forecast -n retail-forecast --create-namespace

# Deploy monitoring (optional)
helm install prometheus prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace

# Upgrade (e.g. after a configuration change)
helm upgrade retail-forecast ./retail-forecast -n retail-forecast

# Roll back
helm rollback retail-forecast -n retail-forecast
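
Note that helm rollback without an explicit revision reverts to the immediately previous release. To pick a specific revision, list the release history first:

# List revisions, then roll back to a chosen one (revision 2 here is just an example)
helm history retail-forecast -n retail-forecast
helm rollback retail-forecast 2 -n retail-forecast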

VI. Step 4: Deploy to K8s from GitHub Actions

The K8s deployment workflow (deploy-k8s.yml)

# .github/workflows/deploy-k8s.yml
name: Deploy to Kubernetes

on:
  push:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,format=long
            type=raw,value=latest

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}

      - name: Setup Kubeconfig
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy to Kubernetes
        run: |
          # Substitute the image tag with sed.
          # The metadata action's tags output is newline-separated and may list
          # "latest" first, so filter it out to get the sha-based tag.
          TAG=$(echo "${{ steps.meta.outputs.tags }}" | grep -v ':latest$' | head -n1 | cut -d':' -f2)
          sed -i "s|image: .*|image: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${TAG}|" k8s/deployment.yaml
          
          # Apply the manifests
          kubectl apply -f k8s/namespace.yaml
          kubectl apply -f k8s/configmap.yaml
          kubectl apply -f k8s/deployment.yaml
          kubectl apply -f k8s/service.yaml
          kubectl apply -f k8s/ingress.yaml
          kubectl apply -f k8s/hpa.yaml
          
          # Wait for the rollout to complete
          kubectl rollout status deployment/retail-forecast-api -n retail-forecast
          kubectl get pods -n retail-forecast
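
An alternative to patching the manifest with sed is kubectl set image, which updates the live Deployment without mutating files in the checkout (a sketch using the same variables as the step above; the container name api matches the Deployment spec):

# Inside the same run step, instead of the sed + kubectl apply of deployment.yaml:
kubectl set image deployment/retail-forecast-api \
  api=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${TAG} \
  -n retail-forecast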

VII. Step 5: Production-Grade Monitoring and Alerting

PrometheusRule configuration (this CRD requires the Prometheus Operator, which the kube-prometheus-stack chart above installs)

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: retail-forecast-alerts
  namespace: retail-forecast
spec:
  groups:
    - name: retail-forecast
      rules:
        # Pod CPU usage too high
        - alert: HighCPUUsage
          expr: |
            sum(rate(container_cpu_usage_seconds_total{
              pod=~"retail-forecast.*"}[5m])) by (pod)
            / on(pod) group_left()
            kube_pod_container_resource_limits{resource="cpu"}
            > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod CPU使用率超过80%"
            description: "Pod {{ $labels.pod }} CPU使用率过高"

        # Pod restarting too often
        - alert: PodRestartingTooMuch
          expr: |
            sum(kube_pod_container_status_restarts_total{
              pod=~"retail-forecast.*"}) by (pod) > 3
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod重启次数过多"
            description: "Pod {{ $labels.pod }} 在5分钟内重启超过3次"

        # HPA pegged at max replicas (metric names below are from kube-state-metrics v1;
        # v2 renamed them to kube_horizontalpodautoscaler_*)
        - alert: HPAAtMaximumReplicas
          expr: |
            kube_hpa_status_current_replicas{
              name="retail-forecast-hpa"} 
            == kube_hpa_spec_max_replicas{
              name="retail-forecast-hpa"}
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "HPA达到最大副本数"
            description: "HPA已达最大副本数{{ $value }},可能需要扩容"

        # Prediction latency too high
        - alert: HighPredictionLatency
          expr: |
            histogram_quantile(0.99, 
              rate(model_prediction_duration_seconds_bucket[5m])) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "预测延迟过高"
            description: "P99延迟超过1秒,当前值: {{ $value }}s"

VIII. FAQ and Pitfalls

Q1: Pod stuck in Pending?

Check whether a PVC (PersistentVolumeClaim) can be bound and whether the nodes have enough free resources: kubectl describe pod <pod-name> -n retail-forecast

Q2: Service unavailable during rolling updates?

Make sure the readinessProbe is configured correctly and that the preStop hook sleeps long enough for kube-proxy to update its endpoints.

Q3: HPA not taking effect?

Make sure metrics-server is installed and running: kubectl top pods -n retail-forecast
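
If metrics-server is missing, the upstream release manifest installs it:

# Install metrics-server from the official release manifest
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml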

Q4: How do I view Pod logs?

kubectl logs -f deployment/retail-forecast-api -n retail-forecast
kubectl logs -f <pod-name> -n retail-forecast --previous  # logs from before the last restart

IX. Comparison: Docker Compose vs Kubernetes

| Dimension | Docker Compose | Kubernetes |
| --- | --- | --- |
| Deployment complexity | Simple | Moderate (requires a cluster) |
| Scaling speed | Manual, minutes | Automatic, seconds |
| Failure recovery | Depends on health-check scripts | Built in |
| Max concurrency | Bounded by a single machine | Scales with cluster size |
| Cost | Low (one server) | Medium to high (multi-node cluster) |
| Best fit | Development, small-scale production | Medium to large-scale production |

X. Full Code Repository

Repository: https://github.com/yourusername/retail-forecast-k8s

What's included:

  • Complete Kubernetes manifests
  • Helm Chart templates
  • GitHub Actions K8s deployment pipeline
  • Prometheus alerting rules
  • Full Grafana dashboard JSON

Coming next: "Advanced Model Monitoring: Data Drift Detection and Automated Retraining", teaching the model to "sense" its own performance degradation.

If you run into problems while deploying, feel free to message me!