Prometheus Monitoring System


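Introduction to Prometheus

Prometheus is an open-source monitoring and alerting system. It stores metrics as time series in a multidimensional data model, where each series is identified by a metric name plus key/value labels, and it provides a powerful query language, PromQL.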

Architecture

Prometheus Server
├── Metric collection (pull)
├── Time-series database storage
├── PromQL queries
└── Alerting rules

Exporters:
├── node-exporter (system metrics)
├── mysql-exporter (MySQL metrics)
├── nginx-exporter (Nginx metrics)
└── custom exporters
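
The custom exporter entry in the tree above can be any HTTP endpoint that serves metrics in the Prometheus text exposition format, which the server then pulls on each scrape. Here is a minimal sketch using only the Python standard library. The metric name `demo_queue_length`, its `queue` label, the sample values, and port 8000 are all hypothetical; a real exporter would more likely be built on the official `prometheus_client` library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(queue_lengths):
    """Render a {queue_name: length} dict in the Prometheus text exposition format.

    Label values are assumed not to need escaping (no quotes, backslashes, or newlines).
    """
    lines = [
        "# HELP demo_queue_length Number of items waiting in a queue (hypothetical metric).",
        "# TYPE demo_queue_length gauge",
    ]
    for name, length in sorted(queue_lengths.items()):
        lines.append(f'demo_queue_length{{queue="{name}"}} {length}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Hard-coded sample values; a real exporter would read them from the monitored system.
        body = render_metrics({"email": 3, "billing": 0}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on the given port (assumed to be free)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding the host and port as a target under `scrape_configs` would be enough for Prometheus to start pulling this metric.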

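Installing Prometheus

Docker deployment

# Use an absolute host path: a bare "prometheus.yml" would be treated as a named volume, not a bind mount
docker run -d --name prometheus \
    -p 9090:9090 \
    -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
    prom/prometheus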

Docker Compose

version: '3.8'

services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      # Mount the rule files referenced by rule_files in prometheus.yml
      - ./rules:/etc/prometheus/rules
      - prometheus_data:/prometheus

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      # Default credentials, suitable for local testing only
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana_data:/var/lib/grafana

  node-exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'

volumes:
  prometheus_data:
  grafana_data:

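Prometheus configuration

# prometheus.yml
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often rules are evaluated

# Relative paths are resolved against the directory containing this file
rule_files:
  - "rules/*.yml"

alerting:
  alertmanagers:
    - static_configs:
        # Assumes an Alertmanager is reachable at this address
        - targets: ['alertmanager:9093']

scrape_configs:
  # Prometheus scrapes its own metrics
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

  # Assumes a mysql-exporter service is running; the Compose file above does not define one
  - job_name: 'mysql'
    static_configs:
      - targets: ['mysql-exporter:9104']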

Alerting rules

# rules/alerts.yml
groups:
  - name: system
    rules:
      - alert: HighCPUUsage
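        # rate() rather than irate(): irate() only uses the last two samples, which makes alerts prone to flapping
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80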
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 80% for 5 minutes"
      
      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
      
      - alert: DiskSpaceLow
        expr: (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 85
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
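
The `for:` clause in these rules means the expression has to stay true across consecutive evaluations before the alert fires; until then the alert is only pending. Below is a simplified sketch of that state machine in Python. The function name and sample data are illustrative, the evaluations are spaced one minute apart to keep the example short, and options such as `keep_firing_for` are ignored.

```python
def alert_states(samples, threshold, for_seconds):
    """Return the alert state after each rule evaluation.

    samples: list of (timestamp_seconds, value) pairs, one per evaluation.
    """
    states = []
    active_since = None  # time at which the condition first became true
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            held = ts - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append("inactive")
    return states


# Made-up CPU usage (%) checked against the HighCPUUsage rule (> 80 for 5m).
cpu = [(0, 70), (60, 85), (120, 90), (180, 75), (240, 88),
       (300, 91), (360, 95), (420, 92), (480, 90), (540, 89)]
print(alert_states(cpu, threshold=80, for_seconds=300))
```

In this run the dip to 75% at t=180 resets the timer, so the alert only reaches the firing state at t=540, five minutes after the condition became true again at t=240.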

PromQL queries

# CPU usage (%)
100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage (%)
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Disk usage of the root filesystem (%)
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100

# Inbound network traffic (bits per second)
irate(node_network_receive_bytes_total[5m]) * 8
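
`irate()` in the queries above derives a per-second rate from just the last two samples inside the range window, whereas `rate()` averages over the whole window. The arithmetic can be sketched in a few lines of Python. This is a simplification with made-up sample data; it leaves out staleness handling and other details of the real implementation.

```python
def irate(samples):
    """Per-second increase between the last two (timestamp_seconds, counter_value) samples."""
    if len(samples) < 2:
        return None  # not enough data in the window
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A counter that went down is assumed to have reset to zero in between.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / (t_last - t_prev)


# Hypothetical node_network_receive_bytes_total samples, scraped every 15 s.
rx_bytes = [(0, 1_000_000), (15, 1_150_000), (30, 1_450_000)]
bits_per_second = irate(rx_bytes) * 8  # bytes/s to bits/s, as in the query above
print(bits_per_second)
```

Only the samples at t=15 and t=30 matter here: 300,000 bytes in 15 seconds is 20,000 bytes per second, or 160,000 bits per second. The first sample has no effect on the result, which is why `irate()` reacts quickly on graphs but is noisy.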

Grafana dashboards

# Import these dashboards from grafana.com by ID
# Node Exporter Full: ID 1860
# MySQL Overview: ID 7362
# Nginx: ID 12708

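Hands-on: a complete monitoring stack

# 1. Start the monitoring stack (with Compose V2: docker compose up -d)
docker-compose up -d

# 2. Open Prometheus
# http://localhost:9090

# 3. Open Grafana
# http://localhost:3000
# Username: admin, password: admin

# 4. Add a data source in Grafana
# Prometheus: http://prometheus:9090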
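
Once the stack is up, one way to confirm that targets are actually being scraped is to ask the Prometheus HTTP API for the built-in `up` metric. Here is a minimal sketch with the Python standard library, assuming Prometheus is reachable at `http://localhost:9090` as in step 2. The helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_vector(payload):
    """Map (job, instance) to a float value from an instant-vector API response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return {
        (s["metric"].get("job"), s["metric"].get("instance")): float(s["value"][1])
        for s in payload["data"]["result"]
    }


def targets_up(base_url="http://localhost:9090"):
    """Return {(job, instance): 1.0 or 0.0}; requires a running Prometheus."""
    with urllib.request.urlopen(query_url(base_url, "up"), timeout=5) as resp:
        return parse_vector(json.load(resp))
```

`targets_up()` should report 1.0 for every healthy target. With the configuration above, the `mysql` job would show 0.0 unless a mysql-exporter is actually running.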

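Summary

Prometheus is the de facto standard monitoring system for cloud-native environments. Combining Prometheus, Grafana, and alerting rules gives you comprehensive system monitoring.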