← 返回首页

生产就绪检查清单

📂 devops ⏱ 2 min 335 words

生产就绪检查清单

检查清单

可靠性

安全性

可观测性

性能

运维

Kubernetes检查

# 生产就绪Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: myapp
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
      containers:
        - name: myapp
          image: myapp:v1
          ports:
            - containerPort: 8080
          envFrom:
            - configMapRef:
                name: myapp-config
            - secretRef:
                name: myapp-secrets
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 10

自动化检查脚本

#!/bin/bash

echo "=== 生产就绪检查 ==="

# 1. 检查健康检查
echo "1. 检查健康检查..."
if kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}' | grep -q "httpGet"; then
    echo "   ✓ Liveness probe configured"
else
    echo "   ✗ Liveness probe missing"
fi

# 2. 检查资源限制
echo "2. 检查资源限制..."
if kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].resources.limits}' | grep -q "cpu"; then
    echo "   ✓ Resource limits configured"
else
    echo "   ✗ Resource limits missing"
fi

# 3. 检查网络策略
echo "3. 检查网络策略..."
if kubectl get networkpolicy -n production | grep -q "myapp"; then
    echo "   ✓ Network policy configured"
else
    echo "   ✗ Network policy missing"
fi

# 4. 检查HPA
echo "4. 检查HPA..."
if kubectl get hpa -n production | grep -q "myapp"; then
    echo "   ✓ HPA configured"
else
    echo "   ✗ HPA missing"
fi

# 5. 检查PDB
echo "5. 检查PDB..."
if kubectl get pdb -n production | grep -q "myapp"; then
    echo "   ✓ PDB configured"
else
    echo "   ✗ PDB missing"
fi

上线流程

# 上线流程定义
steps:
  - name: 代码审查
    owner: development
    checklist:
      - 代码审查完成
      - 单元测试通过
      - 集成测试通过
  
  - name: 安全审查
    owner: security
    checklist:
      - 依赖漏洞扫描
      - 镜像安全扫描
      - 安全配置审查
  
  - name: 性能测试
    owner: sre
    checklist:
      - 负载测试完成
      - 性能基线建立
      - 容量评估完成
  
  - name: 部署
    owner: sre
    checklist:
      - 部署脚本测试
      - 回滚方案准备
      - 监控告警配置
  
  - name: 验证
    owner: sre
    checklist:
      - 功能验证
      - 性能验证
      - 监控验证

最佳实践

  1. 自动化检查
  2. 渐进式上线
  3. 监控先行
  4. 准备回滚
  5. 文档完善

总结

生产就绪检查是确保应用稳定上线的关键。通过系统性的检查清单和自动化工具,可以有效降低上线风险。