Microservices Monitoring and Observability

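Overview

Monitoring and observability are essential parts of a microservices architecture. This tutorial covers how to implement the three pillars: logs, metrics, and traces.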

1. Spring Boot Actuator

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
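  endpoint:
    health:
      # "always" exposes component details to every caller; consider when-authorized in production
      show-details: always
  # The prometheus endpoint requires the micrometer-registry-prometheus dependency
  metrics:
    export:
      prometheus:
        # Spring Boot 2.x key; Spring Boot 3.x uses management.prometheus.metrics.export.enabled
        enabled: true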
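
With the configuration above, the exposed endpoints can be checked from plain Java. The following is a minimal sketch using the JDK's `java.net.http` client. It assumes Actuator's default base path `/actuator` and a service reachable at a URL such as `http://localhost:8080`, neither of which is stated in the configuration; `ActuatorProbe` and its methods are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of Spring Boot.
final class ActuatorProbe {
    private static final HttpClient CLIENT = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(2))
        .build();

    // Builds e.g. http://localhost:8080/actuator/health (assumes the default base path).
    static URI endpointUri(String baseUrl, String endpoint) {
        String trimmed = baseUrl.endsWith("/")
            ? baseUrl.substring(0, baseUrl.length() - 1)
            : baseUrl;
        return URI.create(trimmed + "/actuator/" + endpoint);
    }

    // Returns true when the health endpoint answers with HTTP 200.
    // By default Actuator answers 200 for UP and 503 for DOWN.
    static boolean isHealthy(String baseUrl) {
        HttpRequest request = HttpRequest.newBuilder(endpointUri(baseUrl, "health"))
            .timeout(Duration.ofSeconds(2))
            .GET()
            .build();
        try {
            HttpResponse<String> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false; // service unreachable or connection dropped
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling `ActuatorProbe.isHealthy("http://localhost:8080")` from a smoke test or a deployment script is one way to confirm that the `health` endpoint is actually exposed.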

2. Prometheus Metrics

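import io.micrometer.core.annotation.Timed;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class MetricsService {
    private final Counter requestCounter;
    private final Counter errorCounter;

    public MetricsService(MeterRegistry registry) {
        // Exported to Prometheus as http_requests_total
        this.requestCounter = Counter.builder("http.requests.total")
            .description("Total HTTP requests")
            .tag("service", "user-service")
            .register(registry);

        // Exported to Prometheus as http_errors_total
        this.errorCounter = Counter.builder("http.errors.total")
            .description("Total HTTP errors")
            .tag("service", "user-service")
            .register(registry);
    }

    // On a plain bean method, @Timed generally takes effect only when a
    // TimedAspect bean is registered and Spring AOP is on the classpath.
    @Timed(value = "user.service.get", description = "Time taken to get user")
    public User getUser(Long id) {
        requestCounter.increment();
        try {
            return loadUser(id);
        } catch (RuntimeException e) {
            errorCounter.increment(); // count failures so error-rate alerts have data
            throw e;
        }
    }

    // Placeholder for the business logic (hypothetical; User is the application's own class)
    private User loadUser(Long id) {
        throw new UnsupportedOperationException("business logic goes here");
    }
}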
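
The counters are registered with dotted names such as `http.errors.total`, while Prometheus queries use underscored names such as `http_errors_total`. The sketch below approximates that renaming for counters in plain Java. It is a simplified illustration, not Micrometer's own code; `PrometheusNames` is a hypothetical helper, and the validity check uses the classic Prometheus metric-name grammar.

```java
import java.util.regex.Pattern;

// Hypothetical helper approximating how dotted counter names end up in Prometheus.
final class PrometheusNames {
    // Classic Prometheus metric-name grammar.
    private static final Pattern VALID = Pattern.compile("[a-zA-Z_:][a-zA-Z0-9_:]*");

    // Converts a dotted counter name to snake case and ensures a _total suffix.
    static String counterName(String dottedName) {
        String snake = dottedName.replace('.', '_');
        // Replace any remaining characters Prometheus does not accept.
        snake = snake.replaceAll("[^a-zA-Z0-9_:]", "_");
        return snake.endsWith("_total") ? snake : snake + "_total";
    }

    // Checks a name against the classic metric-name grammar.
    static boolean isValid(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(counterName("http.requests.total")); // http_requests_total
        System.out.println(counterName("http.errors"));         // http_errors_total
        System.out.println(isValid("http.errors.total"));       // false
    }
}
```

Keeping this mapping in mind helps when a metric defined in Java has to be matched against an expression in an alert rule or a dashboard query.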

3. Practical Examples

Distributed Tracing

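import brave.Tracing;
import brave.sampler.Sampler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import zipkin2.Span;
import zipkin2.reporter.AsyncReporter;
import zipkin2.reporter.okhttp3.OkHttpSender;

@Configuration
public class TracingConfig {
    @Bean
    public Tracing tracing() {
        // Zipkin v2 span endpoint; adjust the host for your environment
        OkHttpSender sender = OkHttpSender.create("http://localhost:9411/api/v2/spans");
        // Batches spans and sends them asynchronously
        AsyncReporter<Span> reporter = AsyncReporter.builder(sender).build();

        // Note: newer Brave releases deprecate spanReporter in favor of
        // addSpanHandler with a ZipkinSpanHandler.
        return Tracing.newBuilder()
            .localServiceName("user-service")
            .spanReporter(reporter)
            // Samples every request; use a lower rate in production, e.g. Sampler.create(0.1f)
            .sampler(Sampler.ALWAYS_SAMPLE)
            .build();
    }
}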
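
Tracing across services works because each outgoing request carries the trace context. Brave propagates it with B3 headers by default and handles this automatically once instrumentation is in place. Purely to illustrate what travels on the wire, the sketch below builds B3 multi-header values by hand; `B3Headers` is a hypothetical class and is not needed in a real application.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration of B3 multi-header propagation; Brave does this for you.
final class B3Headers {
    // 16 lower-hex characters represent a 64-bit ID.
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // Headers for the root span of a new, sampled trace.
    static Map<String, String> root() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", newId());
        headers.put("X-B3-SpanId", newId());
        headers.put("X-B3-Sampled", "1");
        return headers;
    }

    // Headers for a downstream call: same trace ID, fresh span ID,
    // and the caller's span ID recorded as the parent.
    static Map<String, String> child(Map<String, String> parent) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("X-B3-TraceId", parent.get("X-B3-TraceId"));
        headers.put("X-B3-ParentSpanId", parent.get("X-B3-SpanId"));
        headers.put("X-B3-SpanId", newId());
        headers.put("X-B3-Sampled", parent.get("X-B3-Sampled"));
        return headers;
    }
}
```

Because every hop keeps the same `X-B3-TraceId`, Zipkin can stitch spans reported by different services into a single trace.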

Alerting Configuration

# Alerting rules file, loaded through the rule_files section of prometheus.yml
groups:
- name: java-apps
  rules:
  - alert: HighErrorRate
    expr: rate(http_errors_total{service="user-service"}[5m]) > 0.1
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High error rate detected"
      description: "Error rate is {{ $value }} per second"
  
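  - alert: HighLatency
    # Assumes a histogram named http_request_duration_seconds is exported. Spring Boot's built-in
    # HTTP timer is http_server_requests_seconds and needs percentile histograms enabled.
    expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 1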
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High latency detected"
      description: "95th percentile latency is {{ $value }} seconds"
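
To make the `HighLatency` expression less of a black box: `histogram_quantile` estimates a quantile from cumulative bucket counts by locating the bucket that contains the target rank and interpolating linearly inside it. The sketch below is a simplified Java illustration of that idea, not Prometheus' actual implementation; the class name `HistogramQuantile` and the sample bucket values are made up for the example.

```java
// Hypothetical, simplified illustration; Prometheus' own histogram_quantile
// handles more edge cases than this sketch does.
final class HistogramQuantile {
    /**
     * @param q           target quantile, expected in (0, 1]
     * @param upperBounds bucket upper bounds ("le" labels), ascending, last one +Infinity
     * @param cumulative  cumulative observation counts (or rates) per bucket
     */
    static double quantile(double q, double[] upperBounds, double[] cumulative) {
        if (upperBounds.length < 2 || upperBounds.length != cumulative.length) {
            throw new IllegalArgumentException("need at least two buckets of equal length");
        }
        if (q <= 0 || q > 1) {
            throw new IllegalArgumentException("q must be in (0, 1]");
        }
        int last = upperBounds.length - 1;
        double total = cumulative[last];
        if (total == 0) {
            return Double.NaN; // no observations
        }
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < last && cumulative[i] < rank) {
            i++;
        }
        if (i == last) {
            // Rank falls into the +Inf bucket: report the highest finite bound.
            return upperBounds[last - 1];
        }
        double start = (i == 0) ? 0.0 : upperBounds[i - 1];
        double before = (i == 0) ? 0.0 : cumulative[i - 1];
        double end = upperBounds[i];
        // Linear interpolation inside the selected bucket.
        return start + (end - start) * (rank - before) / (cumulative[i] - before);
    }

    public static void main(String[] args) {
        double[] bounds = {0.1, 0.5, 1.0, Double.POSITIVE_INFINITY};
        double[] counts = {50, 90, 98, 100};
        System.out.println(quantile(0.95, bounds, counts)); // 0.8125
    }
}
```

With these sample buckets the estimated 95th percentile is about 0.81 seconds, which stays below the 1 second threshold, so the alert would not fire. The result is only as precise as the bucket boundaries, which is worth keeping in mind when choosing them around the alert threshold.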

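4. Best Practices

  1. Use standard metrics: follow the Prometheus naming conventions
  2. Set up alerting rules: detect problems early
  3. Distributed tracing: use Jaeger or Zipkin
  4. Log aggregation: use ELK or Loki
  5. Visual monitoring: use Grafana dashboards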

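Summary

Monitoring and observability are essential parts of a microservices architecture. With logs, metrics, and traces in place, you can build microservice systems that are maintainable and reliable.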