Observability: Logs, Metrics, and Tracing
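Overview
Observability is the ability to understand a system's internal state from the signals it emits. This tutorial covers the three main signals: logs, metrics, and distributed traces.
1. Logging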
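import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;

@Component
public class LoggingService {
    private static final Logger logger = LoggerFactory.getLogger(LoggingService.class);

    public void processRequest(String requestId) {
        // Attach the request ID to every log line emitted on this thread.
        MDC.put("requestId", requestId);
        try {
            logger.info("Request processing started");
            // Business logic goes here
            logger.info("Request processing completed");
        } catch (Exception e) {
            logger.error("Request processing failed", e);
        } finally {
            // Remove only the key set above, so pooled threads do not leak it into
            // later requests and context set by other components is left intact.
            MDC.remove("requestId");
        }
    }
}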
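The requestId stored in MDC above is most useful when it ends up as a separate, queryable field in each log line. The following is a minimal JDK-only sketch of what a JSON log line with such context fields could look like. `JsonLogLine` and its methods are hypothetical names introduced for illustration; in a real application a JSON encoder configured in the logging backend would normally do this work.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of SLF4J): renders one log event as a single JSON line. */
final class JsonLogLine {

    /** Escapes the characters that must be escaped inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the 4-digit hex escape form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Emits timestamp, level, and message first, followed by the context fields. */
    static String format(Instant timestamp, String level, String message,
                         Map<String, String> context) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timestamp", timestamp.toString());
        fields.put("level", level);
        fields.put("message", message);
        fields.putAll(context); // e.g. the requestId that MDC would carry

        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(format(Instant.now(), "INFO", "Request processing started",
                Map.of("requestId", "req-42")));
    }
}
```

Because every field is a key/value pair, a log backend can filter on requestId directly instead of pattern-matching free text.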
2. Metrics
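import io.micrometer.core.instrument.*;
import org.springframework.stereotype.Component;
import java.time.Duration;

@Component
public class MetricsService {
    private final Counter requestCounter;
    private final Timer requestTimer;
    private final Gauge activeUsers;
    // Assumption: ActiveUserService is an application bean exposing getActiveCount();
    // the original listing uses it without declaring it.
    private final ActiveUserService activeUserService;

    public MetricsService(MeterRegistry registry, ActiveUserService activeUserService) {
        this.activeUserService = activeUserService;
        // Counter: a value that only ever increases.
        this.requestCounter = Counter.builder("http.requests.total")
                .description("Total HTTP requests")
                .register(registry);
        // Timer: records both the count and the total duration of events.
        this.requestTimer = Timer.builder("http.request.duration")
                .description("HTTP request duration")
                .register(registry);
        // Gauge: samples the current value each time the registry is read.
        this.activeUsers = Gauge.builder("app.active.users", this, MetricsService::getActiveUserCount)
                .description("Active users count")
                .register(registry);
    }

    public void recordRequest() {
        requestCounter.increment();
    }

    public void recordDuration(long durationMs) {
        requestTimer.record(Duration.ofMillis(durationMs));
    }

    private double getActiveUserCount() {
        return activeUserService.getActiveCount();
    }
}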
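The meter names above use Micrometer's dotted style, while Prometheus expects snake_case names with unit and type suffixes. The sketch below approximates how the three names registered above would likely appear when scraped. It is a simplified illustration under that assumption, not Micrometer's actual naming code, and `PrometheusNames` is a hypothetical helper. The output of the Prometheus endpoint is the authoritative source for the real names.

```java
/** Simplified approximation of the Micrometer-to-Prometheus name mapping (illustrative only). */
final class PrometheusNames {

    enum MeterType { COUNTER, TIMER, GAUGE }

    static String toPrometheusName(String micrometerName, MeterType type) {
        // Dots and any other characters outside the Prometheus name alphabet become underscores.
        String name = micrometerName.replaceAll("[^a-zA-Z0-9_:]", "_");
        switch (type) {
            case COUNTER:
                // Counters conventionally end in _total; do not append it twice.
                return name.endsWith("_total") ? name : name + "_total";
            case TIMER:
                // Timers are exported in the base unit, seconds.
                return name.endsWith("_seconds") ? name : name + "_seconds";
            default:
                return name;
        }
    }

    public static void main(String[] args) {
        System.out.println(toPrometheusName("http.requests.total", MeterType.COUNTER));
        System.out.println(toPrometheusName("http.request.duration", MeterType.TIMER));
        System.out.println(toPrometheusName("app.active.users", MeterType.GAUGE));
    }
}
```

One practical consequence: dashboards and alert rules should be written against the converted names, not the dotted ones used in the Java code.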
3. Distributed Tracing
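import brave.Span;
import brave.Tracer;
import brave.Tracing;
import brave.sampler.Sampler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Component;
import zipkin2.reporter.AsyncReporter;
import zipkin2.reporter.okhttp3.OkHttpSender;

@Configuration
public class TracingConfig {
    @Bean
    public Tracing tracing() {
        // Report finished spans to a local Zipkin collector over HTTP.
        OkHttpSender sender = OkHttpSender.create("http://localhost:9411/api/v2/spans");
        AsyncReporter<zipkin2.Span> reporter = AsyncReporter.builder(sender).build();
        return Tracing.newBuilder()
                .localServiceName("my-service")
                .spanReporter(reporter)
                .sampler(Sampler.ALWAYS_SAMPLE) // records every trace
                .build();
    }

    // Expose the Tracer as a bean so it can be injected into TracingService.
    @Bean
    public Tracer tracer(Tracing tracing) {
        return tracing.tracer();
    }
}

@Component
public class TracingService {
    private final Tracer tracer;

    public TracingService(Tracer tracer) {
        this.tracer = tracer;
    }

    public void processRequest() {
        Span span = tracer.newTrace().name("processRequest").start();
        try (Tracer.SpanInScope ws = tracer.withSpanInScope(span)) {
            // Business logic goes here
            span.annotate("processing");
        } catch (RuntimeException e) {
            span.error(e); // record the failure on the span, then let it propagate
            throw e;
        } finally {
            span.finish();
        }
    }
}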
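The spans created above stay inside one process. For a trace to follow a request across services, the trace identifiers have to travel with the request, which Zipkin-based tracers such as Brave commonly do through B3 HTTP headers. The following is a hand-rolled sketch of that idea using only the JDK. `B3Propagation` and `TraceContext` are hypothetical names for illustration, and Brave's HTTP instrumentation would normally handle this without any manual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative B3 header propagation; not Brave's implementation. */
final class B3Propagation {

    /** Minimal trace context: the trace ID is shared by all spans, the span ID is per operation. */
    static final class TraceContext {
        final String traceId;
        final String spanId;
        final String parentSpanId; // null for a root span
        final boolean sampled;

        TraceContext(String traceId, String spanId, String parentSpanId, boolean sampled) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
            this.sampled = sampled;
        }
    }

    /** Generates a random 64-bit ID encoded as 16 lower-hex characters. */
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    /** Client side: copy the current context into the outgoing request headers. */
    static Map<String, String> inject(TraceContext ctx) {
        Map<String, String> headers = new HashMap<>();
        headers.put("X-B3-TraceId", ctx.traceId);
        headers.put("X-B3-SpanId", ctx.spanId);
        if (ctx.parentSpanId != null) {
            headers.put("X-B3-ParentSpanId", ctx.parentSpanId);
        }
        headers.put("X-B3-Sampled", ctx.sampled ? "1" : "0");
        return headers;
    }

    /**
     * Server side: continue the caller's trace, or start a new one if no IDs arrived.
     * Simplifications: header lookup is case-sensitive here, and a missing
     * X-B3-Sampled header is treated as "sampled".
     */
    static TraceContext extract(Map<String, String> headers) {
        String traceId = headers.get("X-B3-TraceId");
        String spanId = headers.get("X-B3-SpanId");
        if (traceId == null || spanId == null) {
            return new TraceContext(newId(), newId(), null, true);
        }
        boolean sampled = !"0".equals(headers.get("X-B3-Sampled"));
        return new TraceContext(traceId, spanId, headers.get("X-B3-ParentSpanId"), sampled);
    }

    /** Child span for downstream work: same trace ID, new span ID, parent set to the current span. */
    static TraceContext newChild(TraceContext parent) {
        return new TraceContext(parent.traceId, newId(), parent.spanId, parent.sampled);
    }

    public static void main(String[] args) {
        TraceContext root = new TraceContext(newId(), newId(), null, true);
        Map<String, String> headers = inject(root);      // service A sends the request
        TraceContext child = newChild(extract(headers)); // service B continues the trace
        System.out.println(root.traceId.equals(child.traceId));
    }
}
```

The key property is that the trace ID never changes along the call chain, while each hop records its own span ID and points back to its parent.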
4. Practical Examples
Health Check
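import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class DependencyHealthIndicator implements HealthIndicator {
    @Override
    public Health health() {
        boolean db = isDatabaseConnected();
        boolean redis = isRedisConnected();
        // Report each dependency separately so a failure points at the right one.
        Health.Builder builder = (db && redis) ? Health.up() : Health.down();
        return builder
                .withDetail("database", db ? "connected" : "disconnected")
                .withDetail("redis", redis ? "connected" : "disconnected")
                .build();
    }
    // Placeholder checks (not defined in the original); replace with real connectivity probes.
    private boolean isDatabaseConnected() { return true; }
    private boolean isRedisConnected() { return true; }
}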
Prometheus Metrics
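management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus
  # Spring Boot 2.x property; Spring Boot 3.x uses
  # management.prometheus.metrics.export.enabled instead.
  metrics:
    export:
      prometheus:
        enabled: true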
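5. Best Practices
- Structured logging: emit logs as JSON so they can be parsed and queried by field.
- Consistent metric naming: follow the Prometheus naming conventions.
- Sampling strategy: choose a trace sampling rate that balances visibility against overhead.
- Context propagation: propagate the trace context across service boundaries.
- Monitoring and alerting: set sensible alert thresholds.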
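To make the sampling-strategy item concrete, here is a minimal sketch of a rate-based sampler that decides from the trace ID alone, so the same trace always gets the same decision. `RateSampler` is a hypothetical class written for illustration and is not Brave's implementation. It assumes trace IDs are roughly uniformly distributed.

```java
/** Hypothetical rate sampler keyed on the trace ID (illustrative only). */
final class RateSampler {
    private static final long BUCKETS = 10_000L;
    private final long boundary; // number of buckets, out of 10,000, that are sampled

    RateSampler(double rate) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be in [0, 1]");
        }
        this.boundary = Math.round(rate * BUCKETS);
    }

    /** Deterministic: a given trace ID always maps to the same bucket and the same decision. */
    boolean isSampled(long traceId) {
        long bucket = Math.floorMod(traceId, BUCKETS); // always in 0..9999, even for negative IDs
        return bucket < boundary;
    }

    public static void main(String[] args) {
        RateSampler sampler = new RateSampler(0.1);
        int sampled = 0;
        for (long id = 0; id < 100_000; id++) {
            if (sampler.isSampled(id)) {
                sampled++;
            }
        }
        System.out.println("sampled " + sampled + " of 100000 traces");
    }
}
```

In the Brave configuration shown earlier, replacing Sampler.ALWAYS_SAMPLE with Sampler.create(0.1f) should have a similar effect of keeping roughly 10% of traces.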
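Summary
Logs, metrics, and distributed traces together make a system's internal state visible. Mastering all three makes it possible to build systems that are maintainable and diagnosable.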