Canary Releases: Gradual Rollout and Automated Rollback Architecture

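How Canary Releases Work

A canary release rolls a new version out to a small share of users first, watches key metrics such as error rate and latency, and promotes the version to all traffic only once those metrics show no regression. The name comes from the canaries that coal miners once carried underground as an early warning of dangerous gas.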

                  ┌─────────────────┐
                  │ Ingress traffic │
                  └────────┬────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
              ▼ 95%                     ▼ 5% (canary)
     ┌─────────────────┐       ┌─────────────────┐
     │ Current v1.0    │       │ New v1.1        │
     │ (Stable)        │       │ (Canary)        │
     └────────┬────────┘       └────────┬────────┘
              │                         │
              └────────────┬────────────┘
                           │
                   Metric monitoring
                           │
               ┌───────────┴───────────┐
               │                       │
               ▼ metrics healthy       ▼ metrics degraded
     Promote v1.1 to 100%    Auto-rollback to v1.0
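
The promote-or-roll-back decision at the bottom of the diagram can be sketched as a small shell function. This is only an illustration: the function name `canary_verdict`, the 1% error-rate ceiling, and the 500 ms P99 ceiling are assumptions, not values prescribed by Istio or by this article's setup.

```shell
#!/bin/bash
# Hypothetical helper: decide whether the canary may be promoted.
# Arguments: error_rate (fraction, e.g. 0.004), p99_latency_ms,
#            optional max_error_rate (default 0.01), optional max_p99_ms (default 500).
canary_verdict() {
  local error_rate="$1" p99_ms="$2"
  local max_error_rate="${3:-0.01}" max_p99_ms="${4:-500}"
  # awk performs the floating-point comparison; shell arithmetic is integer-only.
  awk -v e="$error_rate" -v l="$p99_ms" -v me="$max_error_rate" -v ml="$max_p99_ms" \
    'BEGIN { if (e + 0 > me + 0 || l + 0 > ml + 0) print "rollback"; else print "promote" }'
}

canary_verdict 0.002 180   # prints "promote"
canary_verdict 0.030 180   # prints "rollback"
```

Keeping the thresholds as arguments makes it easy to tighten them for later, higher-traffic steps.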

Istio Canary Release Configuration

Traffic Splitting with VirtualService

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: stable
      weight: 90
    - destination:
        host: myapp
        subset: canary
      weight: 10
    # timeout and retries belong to the HTTP route entry, at the same level as `route`
    timeout: 10s
    retries:
      attempts: 3
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: myapp
spec:
  host: myapp
  subsets:
  - name: stable
    labels:
      version: v1.0
  - name: canary
    labels:
      version: v1.1
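
One way to sanity-check that a configured split is actually honored is to sample responses and tally which version served each one. The sketch below only does the tallying; how the version label is obtained (for example from a response header) depends on the application and is left as an assumption. The function name `split_summary` is hypothetical.

```shell
#!/bin/bash
# Tally version labels (one per line on stdin) and print each version's
# share of the sample as an integer percentage, sorted by version name.
split_summary() {
  sort | uniq -c | awk '
    { count[$2] = $1; total += $1 }
    END {
      for (v in count) printf "%s %d%%\n", v, count[v] * 100 / total
    }' | sort
}

# Synthetic sample: 9 responses from v1.0 and 1 from v1.1.
{ for i in 1 2 3 4 5 6 7 8 9; do echo v1.0; done; echo v1.1; } | split_summary
# prints:
# v1.0 90%
# v1.1 10%
```

With small samples the observed share will drift from the configured weights, so a few hundred requests give a more useful picture than ten.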

Progressive Traffic Shifting

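#!/bin/bash
# canary-rollout.sh
# Shift traffic from stable to canary step by step; roll back on a high error rate.
set -o pipefail

# Assumption: address of the Prometheus server; adjust for your cluster.
PROM_URL="${PROM_URL:-http://prometheus:9090}"

STEPS=(95 80 50 20 0)  # stable weight per step; the canary starts at 5%

# Apply a VirtualService with the given stable ($1) and canary ($2) weights.
set_weights() {
  cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: stable
      weight: $1
    - destination:
        host: myapp
        subset: canary
      weight: $2
EOF
}

# Run an instant query against the Prometheus HTTP API and print the first sample value.
prometheus_query() {
  curl -sf --get "${PROM_URL}/api/v1/query" --data-urlencode "query=$1" \
    | jq -r '.data.result[0].value[1] // "0"'
}

# Send all traffic back to the stable version.
rollback() {
  set_weights 100 0
}

for weight in "${STEPS[@]}"; do
  NEW_WEIGHT=$((100 - weight))
  echo "Setting canary weight: ${NEW_WEIGHT}%"
  set_weights "$weight" "$NEW_WEIGHT"

  # Wait, then measure the share of 5xx responses over the last 5 minutes
  sleep 300
  if ! ERROR_RATE=$(prometheus_query 'sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))'); then
    echo "Metric query failed, rolling back to be safe"
    rollback
    exit 1
  fi
  # 0/0 yields NaN when there was no traffic; treat that as no errors
  if [[ "$ERROR_RATE" == "NaN" ]]; then
    ERROR_RATE=0
  fi

  if (( $(echo "$ERROR_RATE > 0.01" | bc -l) )); then
    echo "Error rate too high, triggering rollback"
    rollback
    exit 1
  fi
done

echo "Canary release complete; all traffic now goes to the new version"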
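
Because each step rewrites the live VirtualService, it can help to render a step's manifest without applying it, so the change can be reviewed first. The following is a minimal sketch under that assumption; `render_virtualservice` is a hypothetical helper name, and the manifest mirrors the one used in the rollout script.

```shell
#!/bin/bash
# Hypothetical helper: print the VirtualService manifest for a given canary
# percentage (integer 0..100) without applying it.
render_virtualservice() {
  local canary="$1"
  # Accept only 0..100 written without leading zeros.
  case "$canary" in
    0|[1-9]|[1-9][0-9]|100) ;;
    *) echo "canary weight must be an integer between 0 and 100" >&2; return 1 ;;
  esac
  local stable=$((100 - canary))
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: stable
      weight: ${stable}
    - destination:
        host: myapp
        subset: canary
      weight: ${canary}
EOF
}

# Print the manifest for a 20% canary step.
render_virtualservice 20
```

The output can be piped into `kubectl diff -f -` or `kubectl apply --dry-run=client -f -` to inspect a step before committing to it.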

Automated Rollback Mechanism

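# Rollback driven by Prometheus alerts: the alerting side lives outside Istio.
# The EnvoyFilter below only inserts Envoy's HTTP fault filter into the inbound
# sidecar filter chain. It does not watch Prometheus or roll anything back by
# itself; an external component must react to the alert and reset the
# VirtualService weights.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: canary-monitor
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      # INSERT_BEFORE needs a reference filter; the router filter is the usual anchor
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.fault
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault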
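
An automated rollback generally should not fire on a single noisy sample. One possible guard, sketched here as an assumption rather than as part of any Istio or Prometheus feature, is to roll back only after several consecutive checks breach the threshold. The function name `rollback_gate` and its defaults (1% threshold, 3 consecutive breaches) are hypothetical.

```shell
#!/bin/bash
# Hypothetical debounce for an automated rollback: read one error rate per
# line from stdin and print "rollback" as soon as `limit` consecutive samples
# exceed `threshold`; print "healthy" if that never happens.
rollback_gate() {
  local threshold="${1:-0.01}" limit="${2:-3}"
  awk -v t="$threshold" -v n="$limit" '
    { if ($1 + 0 > t + 0) streak++; else streak = 0 }
    streak + 0 >= n + 0 { print "rollback"; found = 1; exit }
    END { if (!found) print "healthy" }'
}

printf '%s\n' 0.002 0.05 0.003 0.02 | rollback_gate 0.01 2   # prints "healthy"
printf '%s\n' 0.002 0.05 0.04 0.001 | rollback_gate 0.01 2   # prints "rollback"
```

A higher `limit` trades slower reaction for fewer false rollbacks; Prometheus alert rules express a similar idea with their `for` duration.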

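Best Practices

  1. Start with a small share of traffic: begin at 1-5% and increase gradually
  2. Monitor key metrics: error rate, P99 latency, CPU and memory usage
  3. Keep rollback fast: make sure a rollback completes within seconds
  4. Segment users: roll out by user ID, region, device, or similar dimensions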
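
Segmenting by user ID (item 4) needs an assignment that is stable, so the same user does not bounce between versions from one request to the next. A common approach is to hash the ID into a fixed bucket; the sketch below uses `cksum` for this. The function names `user_bucket` and `in_canary` and the ID format are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: map a user ID to a stable bucket in 0..99.
user_bucket() {
  local crc
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( crc % 100 ))
}

# Succeeds (exit status 0) when the user falls inside the canary percentage.
in_canary() {
  local user_id="$1" percent="$2"
  [ "$(user_bucket "$user_id")" -lt "$percent" ]
}

if in_canary "user-42" 5; then
  echo "user-42 -> canary"
else
  echo "user-42 -> stable"
fi
```

Because the bucket depends only on the ID, raising the percentage from 5 to 20 keeps every user who was already in the canary there and only adds new ones.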