Prometheus Architecture: TSDB and Alertmanager Design

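Prometheus Architecture Overview

Prometheus is an open-source monitoring system and a graduated CNCF project. It collects metrics with a pull model (the server scrapes HTTP endpoints exposed by its targets), stores them in a built-in time-series database (TSDB), and provides the PromQL query language plus rule-based alerting through Alertmanager.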

Prometheus architecture:
┌─────────────────────────────────────────────────┐
│                Prometheus Server                │
├─────────────────┬───────────────────────────────┤
│  Retrieval      │  TSDB (time-series database)  │
│  Metric scraping│  Local storage                │
├─────────────────┴───────────────────────────────┤
│                Service Discovery                │
│            K8s / Consul / DNS / file            │
└──────────────────────┬──────────────────────────┘
                       │
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
   ┌─────────┐    ┌─────────┐    ┌─────────┐
   │ Exporter│    │ Exporter│    │ App     │
   │ Node    │    │ MySQL   │    │ /metrics│
   └─────────┘    └─────────┘    └─────────┘
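
To make the pull model concrete, here is a minimal prometheus.yml sketch. The job names, hostnames, ports, and file path are illustrative assumptions rather than values from this article; the option names (scrape_interval, static_configs, file_sd_configs, kubernetes_sd_configs) are standard Prometheus configuration fields.

```yaml
# prometheus.yml (sketch; all targets and paths below are hypothetical)
global:
  scrape_interval: 15s        # how often the server pulls each target

scrape_configs:
  # Statically listed exporter, e.g. a Node Exporter
  - job_name: node
    static_configs:
      - targets: ['node-exporter.example.internal:9100']

  # Targets read from files that another tool keeps up to date
  - job_name: mysql
    file_sd_configs:
      - files: ['/etc/prometheus/targets/mysql-*.json']

  # Application pods discovered through the Kubernetes API
  - job_name: app
    kubernetes_sd_configs:
      - role: pod
```

Each job scrapes the /metrics path over HTTP unless metrics_path is set for that job.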

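TSDB Storage Architecture

Data model

Time series = metric name + label set (the series identity) + a stream of (timestamp, value) samples

Example: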
http_requests_total{method="GET", endpoint="/api", status="200"} @1705312200 = 1234
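
Because the label set is part of a series' identity, every new label value creates a new series. One way to keep cardinality in check is to drop labels at scrape time with metric_relabel_configs. The sketch below assumes a hypothetical job named app whose metrics carry a request_id label; neither name comes from this article.

```yaml
scrape_configs:
  - job_name: app                                  # hypothetical job name
    static_configs:
      - targets: ['app.example.internal:8080']     # hypothetical target
    metric_relabel_configs:
      # Remove the (assumed) request_id label from every scraped sample
      - action: labeldrop
        regex: request_id
```

Dropping a label is only safe when the remaining labels still identify each series uniquely; otherwise series collide after relabeling.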

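Storage layout

TSDB storage layout:
├── WAL (write-ahead log)
│   └── Incoming samples are appended to the WAL first, so the head can be rebuilt after a crash
├── Head block (in memory)
│   └── Holds roughly the most recent 2 hours of data
├── Persisted blocks (on disk)
│   ├── The head is cut into a new block about every 2 hours
│   ├── Each block contains chunks, index, and meta
│   └── Blocks are compacted into larger blocks in the background and deleted once past retention
└── Block index
    └── Inverted index built on labels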
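
Where the WAL and blocks live, and how long blocks are kept, is set with command-line flags on the server. The following Docker Compose fragment is a sketch: the volume name and the retention values are assumptions chosen for illustration, while the --storage.tsdb.* flags are real Prometheus flags.

```yaml
# docker-compose.yml (sketch; volume name and retention values are assumptions)
services:
  prometheus:
    image: prom/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus            # holds the WAL and block directories
      - --storage.tsdb.retention.time=15d          # delete blocks older than this
      - --storage.tsdb.retention.size=50GB         # or once blocks exceed this size
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus

volumes:
  prometheus-data:
```

The Prometheus documentation gives a rough sizing rule: needed disk space ≈ retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample after compression.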

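PromQL Query Language

Basic queries

# Instant vector selector: the latest sample of each matching series
http_requests_total{method="GET"}

# Range vector selector: all samples from the last 5 minutes for each matching series
http_requests_total{method="GET"}[5m]

# Aggregation: per-second request rate, summed by status
sum(rate(http_requests_total[5m])) by (status)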

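Common functions

# Per-second rate averaged over 5 minutes (QPS)
rate(http_requests_total[5m])

# Error ratio over the last 5 minutes
sum(rate(http_requests_total{status=~"5.."}[5m]))
/
sum(rate(http_requests_total[5m]))

# P99 latency per series; wrap the rate in sum by (le) (...) to aggregate across instances
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))

# Predicted available filesystem bytes 4 hours from now, from a linear fit over the last 6 hours
predict_linear(node_filesystem_avail_bytes[6h], 4*3600)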

Alerting with Alertmanager

Alerting rules

# prometheus-rules.yml (evaluated by Prometheus, which sends firing alerts to Alertmanager)
groups:
  - name: application-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m])) by (service)
          /
          sum(rate(http_requests_total[5m])) by (service)
          > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.service }} error rate is above 5%"
          description: "Current error rate: {{ $value | humanizePercentage }}"
      
      - alert: HighLatency
        expr: |
          histogram_quantile(0.99, 
            rate(http_request_duration_seconds_bucket[5m])
          ) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency is above 1 second"
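
Rules only take effect once the Prometheus server loads the rule file and knows where to send alerts. A minimal sketch of that wiring follows; the Alertmanager hostname is a hypothetical placeholder, while rule_files, evaluation_interval, and alerting.alertmanagers are standard prometheus.yml fields.

```yaml
# prometheus.yml (sketch; the Alertmanager address is hypothetical)
global:
  evaluation_interval: 30s     # how often rules are evaluated

rule_files:
  - prometheus-rules.yml       # the rule file shown above

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager.example.internal:9093']
```

Running promtool check rules prometheus-rules.yml before a reload catches syntax errors in the rule file.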

Alertmanager configuration

# alertmanager.yml
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager'
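  smtp_auth_password: 'password'  # placeholder; keep real credentials out of source control
  slack_api_url: '<webhook-url>'  # placeholder; the 'slack' receiver needs a webhook URL here or as api_url in slack_configs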

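route:
  group_by: ['alertname', 'service']  # alerts sharing these labels go into one notification
  group_wait: 10s        # wait before sending the first notification for a new group
  group_interval: 10s    # wait before notifying about new alerts added to an existing group
  repeat_interval: 1h    # resend interval for alerts that are still firing
  receiver: 'default'    # fallback when no child route matches
  routes:
    - matchers:          # matchers replaces the deprecated match field (Alertmanager 0.22+)
        - severity="critical"
      receiver: 'pager'
      repeat_interval: 5m
    - matchers:
        - severity="warning"
      receiver: 'slack'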

receivers:
  - name: 'default'
    email_configs:
      - to: 'team@example.com'
  
  - name: 'pager'
    pagerduty_configs:
      - service_key: '<key>'
  
  - name: 'slack'
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
        title: '{{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'

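inhibit_rules:
  # Mute warning alerts while a critical alert with the same alertname and instance is firing
  - source_matchers:     # source_matchers/target_matchers replace the deprecated *_match fields
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal: ['alertname', 'instance']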

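High-Availability Deployment

# Prometheus high-availability architecture (conceptual outline, not a loadable config file)
high_availability:
  replication:
    - Run two Prometheus instances that scrape the same targets
    - Use Thanos or VictoriaMetrics for long-term storage

  federation:
    - Aggregate data from multiple Prometheus servers
    - Provides a global view

  remote_write:
    - Ship samples to remote storage (Thanos/Cortex)
    - Provides durable storage and cross-cluster queries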
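
The outline above maps onto a few real prometheus.yml settings. The sketch below shows one replica that ships samples with remote_write and a separate global server that federates from it. All label values, hostnames, and the remote-write URL are hypothetical; the exact write path depends on the chosen backend.

```yaml
# prometheus.yml on one replica (sketch; label values and URLs are hypothetical)
global:
  external_labels:
    cluster: prod-eu        # identifies the source cluster downstream
    replica: prometheus-0   # differs per replica so the backend can deduplicate

remote_write:
  - url: https://metrics-store.example.internal/api/v1/write   # hypothetical endpoint

---
# prometheus.yml on a global (federating) server
scrape_configs:
  - job_name: federate
    honor_labels: true          # keep the labels set by the source servers
    metrics_path: /federate
    params:
      'match[]':
        - '{job="app"}'         # pull only the series selected here
    static_configs:
      - targets: ['prometheus-eu.example.internal:9090']
```

Federating pre-aggregated series rather than raw ones keeps the global server's load manageable.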

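Best Practices

Labels: use meaningful label names and avoid high-cardinality labels (such as user IDs or request IDs), since each distinct label set creates a new series
Recording rules: precompute frequently used queries to reduce query-time work
Storage planning: size disks from the ingestion rate (samples per second) and the retention period
Federation: in large deployments, use hierarchical federation to aggregate data across layers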
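
As an illustration of the recording-rules practice, the sketch below precomputes the per-service request and error rates that the HighErrorRate alert uses. The group name and record names are illustrative assumptions (they follow the level:metric:operations naming convention from the Prometheus documentation), and the service label is assumed to exist on http_requests_total as in the alerting rule.

```yaml
# recording-rules.yml (sketch; group and record names are illustrative)
groups:
  - name: http-recording-rules
    interval: 1m
    rules:
      # Per-service request rate, reusable by dashboards and alerts
      - record: service:http_requests:rate5m
        expr: sum by (service) (rate(http_requests_total[5m]))

      # Per-service 5xx rate
      - record: service:http_requests_errors:rate5m
        expr: sum by (service) (rate(http_requests_total{status=~"5.."}[5m]))
```

An alert expression could then be written as service:http_requests_errors:rate5m / service:http_requests:rate5m > 0.05, which is cheaper to evaluate than recomputing both rates on every evaluation.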