Horizontal Scaling for Model Services: From a Single Machine to Kubernetes — Highly Available Deployment of a Sales Forecasting Service on K8s
This article is the follow-up to "Automated Deployment and Monitoring of a Sales Forecasting Model with Docker + GitHub Actions". It focuses on upgrading a single-machine Docker deployment to a Kubernetes cluster deployment with autoscaling and high availability, and includes complete Kubernetes manifests, a Helm Chart template, and production-grade best practices.
I. Why Kubernetes?
The evolution path from Docker Compose to Kubernetes
In the previous article we deployed API + Prometheus + Grafana on a single machine with Docker Compose. As the business grows, this hits the following bottlenecks:
| Problem | Docker Compose | Kubernetes |
|---|---|---|
| Scaling | Edit replica count by hand | Automatic via HorizontalPodAutoscaler |
| Rolling updates | Manual docker-compose down/up | Zero-downtime Rolling Update |
| Load balancing | Relies on an external Nginx | Built-in Service load balancing |
| Failure recovery | Needs monitoring scripts | Automatic restart and rescheduling |
| Multi-host deployment | Requires Docker Swarm | Native multi-node clustering |
| Configuration management | Env vars scattered everywhere | Centralized ConfigMap + Secrets |
Kubernetes (K8s for short) is a container orchestration platform open-sourced by Google. It automates deployment, scaling, load balancing, self-healing, and more, giving your service enterprise-grade stability and elasticity.
II. Kubernetes Core Concepts at a Glance
Before diving into practice, let's review a few core concepts (a minimal Pod manifest follows this list as a taste of the declarative style):
- Pod: the smallest schedulable unit in K8s; a Pod holds one or more containers (usually one)
- Deployment: the abstraction that manages Pod replica counts, rolling updates, and rollbacks
- Service: a stable access point for a set of Pods, with built-in load balancing
- Ingress: manages external HTTP/HTTPS access
- ConfigMap/Secret: stores configuration and sensitive data
- HorizontalPodAutoscaler (HPA): adjusts the Pod replica count automatically based on CPU/memory
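To make the declarative style concrete, here is a minimal standalone Pod manifest (illustrative only — the name and image are arbitrary). In practice you rarely create bare Pods; a Deployment manages them for you, as we do below:
# demo-pod.yaml: a single-container Pod, for illustration only
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: demo
      image: nginx:alpine    # placeholder image
      ports:
        - containerPort: 80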
III. Step 1: Make the Application Cloud-Native
To run well on K8s, the application needs the following changes:
1. Graceful shutdown support
# app.py: add signal handlers
import signal
import sys

def graceful_shutdown(signum, frame):
    print("Received termination signal, shutting down gracefully...")
    # clean up resources, e.g. close database connections
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)
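How SIGTERM reaches the app depends on the server in front of it: Flask's built-in dev server is fine for local testing, but production images usually run gunicorn, which drains in-flight requests on SIGTERM by itself. A minimal sketch (the worker count and timeout are assumptions; keep --graceful-timeout below the terminationGracePeriodSeconds we set later):
# Dockerfile CMD sketch: gunicorn handles SIGTERM and drains workers gracefully
gunicorn -w 2 -b 0.0.0.0:5000 --graceful-timeout 25 app:app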
2. Configurable health check endpoints
import os
# assumes `app`, `jsonify`, and `model` are defined in app.py as in the previous article
liveness_path = os.environ.get('HEALTH_CHECK_PATH', '/health')
readiness_path = os.environ.get('READINESS_CHECK_PATH', '/ready')

@app.route(liveness_path, methods=['GET'])
def liveness():
    """Liveness probe: is the process alive?"""
    return jsonify({'status': 'ok'})

@app.route(readiness_path, methods=['GET'])
def readiness():
    """Readiness probe: can the app accept traffic?"""
    if model is None:
        return jsonify({'status': 'not ready', 'reason': 'model not loaded'}), 503
    return jsonify({'status': 'ready'})
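Before containerizing, you can smoke-test both endpoints locally (assuming the app listens on port 5000):
curl -i http://localhost:5000/health
curl -i http://localhost:5000/ready    # returns 503 until the model is loaded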
IV. Step 2: Write the Kubernetes Manifests
Directory layout
k8s/
├── namespace.yaml    # namespace
├── configmap.yaml    # configuration
├── deployment.yaml   # Deployment
├── service.yaml      # Service
├── ingress.yaml      # Ingress (optional)
├── hpa.yaml          # autoscaling
└── secret.yaml       # secrets (use an external secret store in production)
1. Create the namespace (namespace.yaml)
apiVersion: v1
kind: Namespace
metadata:
name: retail-forecast
labels:
app: retail-forecast
environment: production
2. Configuration (configmap.yaml)
apiVersion: v1
kind: ConfigMap
metadata:
name: retail-forecast-config
namespace: retail-forecast
data:
HEALTH_CHECK_PATH: "/health"
READINESS_CHECK_PATH: "/ready"
LOG_LEVEL: "info"
MODEL_CACHE_TTL: "3600"
---
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: retail-forecast
data:
prometheus.yml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'retail-forecast-api'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
3. Deployment (deployment.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
name: retail-forecast-api
namespace: retail-forecast
labels:
app: retail-forecast
tier: api
spec:
replicas: 3
selector:
matchLabels:
app: retail-forecast
  # rolling update strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # at most 1 Pod above the desired replica count
      maxUnavailable: 0     # keep 0 Pods unavailable during the rollout
  # Pod template
template:
metadata:
labels:
app: retail-forecast
tier: api
annotations:
        prometheus.io/scrape: "true"     # Prometheus auto-discovery
        prometheus.io/port: "5000"
        prometheus.io/path: "/metrics"
    spec:
      # grace period for graceful termination
      terminationGracePeriodSeconds: 30
containers:
- name: api
image: ghcr.io/yourusername/retail-forecast:latest
imagePullPolicy: Always
ports:
- name: http
containerPort: 5000
protocol: TCP
          # resource limits (prevent one Pod from exhausting cluster resources)
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "1Gi"
          # environment variables
envFrom:
- configMapRef:
name: retail-forecast-config
          # health checks
livenessProbe:
httpGet:
path: /health
port: http
initialDelaySeconds: 10
periodSeconds: 15
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: http
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 3
failureThreshold: 3
          # startup probe (allow time for model loading on cold start)
startupProbe:
httpGet:
path: /ready
port: http
initialDelaySeconds: 0
periodSeconds: 5
            failureThreshold: 30    # wait up to 30 * 5 = 150s for startup
          # lifecycle hooks
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]   # give kube-proxy time to update endpoints
4. Service (service.yaml)
apiVersion: v1
kind: Service
metadata:
name: retail-forecast-api
namespace: retail-forecast
labels:
app: retail-forecast
spec:
  type: ClusterIP          # cluster-internal access
  ports:
    - name: http
      port: 80             # Service port
      targetPort: 5000     # Pod port
protocol: TCP
selector:
app: retail-forecast
---
# NodePort Service (for development and testing)
apiVersion: v1
kind: Service
metadata:
name: retail-forecast-api-nodeport
namespace: retail-forecast
spec:
type: NodePort
ports:
- port: 80
targetPort: 5000
      nodePort: 30080      # fixed NodePort
selector:
app: retail-forecast
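With the ClusterIP Service applied, kubectl port-forward gives a quick test path without exposing anything externally (the local port 8080 is arbitrary):
kubectl port-forward svc/retail-forecast-api 8080:80 -n retail-forecast
# in another terminal:
curl http://localhost:8080/health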
5. Ingress (ingress.yaml)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: retail-forecast-ingress
namespace: retail-forecast
annotations:
    # Nginx Ingress Controller settings
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/limit-rps: "100"   # rate limit: 100 requests/second per client IP
spec:
ingressClassName: nginx
tls:
- hosts:
- api.forecast.example.com
secretName: forecast-tls-secret
rules:
- host: api.forecast.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: retail-forecast-api
port:
number: 80
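The Ingress above references forecast-tls-secret. One way to create it manually (a sketch assuming you already have tls.crt and tls.key on hand); in production, cert-manager can issue and renew it for you:
kubectl create secret tls forecast-tls-secret \
  --cert=tls.crt --key=tls.key \
  -n retail-forecast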
6. Autoscaling (hpa.yaml)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: retail-forecast-hpa
namespace: retail-forecast
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: retail-forecast-api
  # replica count range
minReplicas: 2
maxReplicas: 10
  # scaling metrics
  metrics:
    # CPU utilization (target 70%)
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
    # memory utilization (target 80%)
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
  # cooldown behavior (avoid scaling thrash)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before scaling down
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
policies:
- type: Percent
value: 100
periodSeconds: 15
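Once the HPA is applied (and metrics-server is running — see the FAQ below), you can watch it react to load. A sketch using hey as the load generator (any HTTP load tool works; the hostname assumes the Ingress above):
# watch the replica count in one terminal
kubectl get hpa retail-forecast-hpa -n retail-forecast --watch
# drive 2 minutes of load with 50 concurrent clients in another
hey -z 2m -c 50 https://api.forecast.example.com/health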
V. Step 3: Simplify Deployment with Helm
Helm is the package manager for Kubernetes — think apt/yum — letting you deploy a complex application with a single command.
1. Create the Helm Chart structure
helm create retail-forecast
The generated directory structure:
retail-forecast/
├── Chart.yaml
├── values.yaml
├── templates/
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── hpa.yaml
│ └── _helpers.tpl
└── .helmignore
2. Tune values.yaml
# values.yaml
replicaCount: 3
image:
repository: ghcr.io/yourusername/retail-forecast
pullPolicy: IfNotPresent
tag: "latest"
imagePullSecrets: []
# - name: ghcr-secret
nameOverride: ""
fullnameOverride: ""
serviceAccount:
create: true
annotations: {}
name: ""
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/port: "5000"
prometheus.io/path: "/metrics"
service:
type: ClusterIP
port: 80
ingress:
enabled: true
className: "nginx"
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
hosts:
- host: api.forecast.example.com
paths:
- path: /
pathType: Prefix
service:
port: 80
tls:
- secretName: forecast-tls-secret
hosts:
- api.forecast.example.com
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 100m
memory: 256Mi
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
nodeSelector: {}
tolerations: []
affinity: {}
# monitoring components
monitoring:
enabled: true
prometheus:
enabled: true
namespace: monitoring
grafana:
enabled: true
namespace: monitoring
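Before installing, it's worth linting the chart and rendering the templates locally to confirm your values produce the manifests you expect:
# static checks on the chart
helm lint ./retail-forecast
# render manifests locally without touching the cluster
helm template retail-forecast ./retail-forecast -n retail-forecast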
3. One-command deployment
# add the monitoring chart repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# deploy the application
helm install retail-forecast ./retail-forecast -n retail-forecast --create-namespace
# deploy monitoring (optional)
helm install prometheus prometheus-community/kube-prometheus-stack \
-n monitoring --create-namespace
# upgrade (e.g. after a config change)
helm upgrade retail-forecast ./retail-forecast -n retail-forecast
# roll back to the previous release
helm rollback retail-forecast -n retail-forecast
VI. Step 4: Deploy to K8s from GitHub Actions
The updated workflow (deploy-k8s.yml)
# .github/workflows/deploy-k8s.yml
name: Deploy to Kubernetes
on:
push:
branches: [main]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=sha,format=long
type=raw,value=latest
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
- name: Setup Kubeconfig
uses: azure/k8s-set-context@v3
with:
kubeconfig: ${{ secrets.KUBE_CONFIG }}
- name: Deploy to Kubernetes
run: |
          # patch the image tag with sed; use the full-SHA tag that
          # type=sha,format=long produces (default prefix "sha-"), which is
          # more reliable than parsing the multi-line tags output
          TAG=sha-${{ github.sha }}
          sed -i "s|image: .*|image: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${TAG}|" k8s/deployment.yaml
          # apply the manifests
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/ingress.yaml
kubectl apply -f k8s/hpa.yaml
          # wait for the rollout to complete
kubectl rollout status deployment/retail-forecast-api -n retail-forecast
kubectl get pods -n retail-forecast
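The workflow assumes a KUBE_CONFIG repository secret containing a kubeconfig with deploy permissions. One way to set it (a sketch using the GitHub CLI; ideally the kubeconfig belongs to a deploy-only service account rather than an admin):
gh secret set KUBE_CONFIG < ~/.kube/config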
VII. Step 5: Production-Grade Monitoring and Alerting
PrometheusRule configuration
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: retail-forecast-alerts
namespace: retail-forecast
spec:
groups:
- name: retail-forecast
rules:
        # Pod CPU usage too high
- alert: HighCPUUsage
expr: |
sum(rate(container_cpu_usage_seconds_total{
pod=~"retail-forecast.*"}[5m])) by (pod)
/ on(pod) group_left()
kube_pod_container_resource_limits{resource="cpu"}
> 0.8
for: 5m
labels:
severity: warning
annotations:
          summary: "Pod CPU usage above 80%"
          description: "Pod {{ $labels.pod }} CPU usage is too high"
        # Pod restarting too often
        - alert: PodRestartingTooMuch
          expr: |
            sum(kube_pod_container_status_restarts_total{
              pod=~"retail-forecast.*"}) by (pod) > 3
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod restarting too often"
            description: "Pod {{ $labels.pod }} has restarted more than 3 times in total"
        # HPA pinned at maximum replicas
        # note: kube-state-metrics v2 renames these metrics to kube_horizontalpodautoscaler_*
        # (with a `horizontalpodautoscaler` label instead of `name`); adjust for your version
        - alert: HPAAtMaximumReplicas
          expr: |
            kube_hpa_status_current_replicas{
              name="retail-forecast-hpa"}
            == kube_hpa_spec_max_replicas{
              name="retail-forecast-hpa"}
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "HPA at maximum replicas"
            description: "HPA has been at its maximum replica count ({{ $value }}); the ceiling may need raising"
        # prediction latency too high
- alert: HighPredictionLatency
expr: |
histogram_quantile(0.99,
rate(model_prediction_duration_seconds_bucket[5m])) > 1
for: 5m
labels:
severity: warning
annotations:
            summary: "Prediction latency too high"
            description: "P99 latency above 1 second; current value: {{ $value }}s"
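If you installed kube-prometheus-stack earlier, its operator can pick up PrometheusRule objects automatically — check the stack's ruleSelector settings, which by default match on release labels. Applying and verifying (the file name here is assumed):
kubectl apply -f k8s/prometheusrule.yaml
kubectl get prometheusrules -n retail-forecast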
VIII. FAQ and Pitfall Guide
Q1: Pod stuck in Pending?
Check whether the PVC (persistent volume) can be satisfied and whether the nodes have enough resources: kubectl describe pod <pod-name> -n retail-forecast
Q2: Service unavailable during a rolling update?
Make sure the readinessProbe is configured correctly and the preStop hook sleeps long enough for kube-proxy to update its endpoints.
Q3: HPA not taking effect?
Make sure metrics-server is installed and running: kubectl top pods -n retail-forecast
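If kubectl top errors out, metrics-server is probably missing. One common way to install it (managed clusters often ship it preinstalled; verify the manifest version against your cluster):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml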
Q4:如何查看Pod日志?
kubectl logs -f deployment/retail-forecast-api -n retail-forecast
kubectl logs -f <pod-name> -n retail-forecast --previous # 查看重启前的日志
IX. Performance Comparison: Docker Compose vs Kubernetes
| Metric | Docker Compose | Kubernetes |
|---|---|---|
| Deployment complexity | Simple | Moderate (needs a cluster) |
| Scaling speed | Manual, minutes | Automatic, seconds |
| Failure recovery | Relies on health-check scripts | Built in |
| Max concurrency | Bounded by one machine | Scales with the cluster |
| Cost | Low (one server) | Medium-high (multi-node cluster) |
| Best fit | Development, small-scale production | Medium/large-scale production |
X. Complete Code Repository
Repository: https://github.com/yourusername/retail-forecast-k8s
Contents:
- Complete Kubernetes manifests
- Helm Chart template
- GitHub Actions K8s deployment pipeline
- Prometheus alerting rules
- Full Grafana Dashboard JSON
Next up: "Advanced Model Monitoring: Data Drift Detection and Automatic Retraining" — teaching the model to "sense" its own performance degradation
If you run into problems during deployment, feel free to send me a private message!