
Jaeger Distributed Tracing: Following Requests Across Microservices


What Is Jaeger

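Jaeger is a distributed tracing system open-sourced by Uber for monitoring and troubleshooting request paths in microservice architectures. It was built around the OpenTracing standard and helps developers see how services call one another and pinpoint performance bottlenecks. OpenTracing has since been superseded by OpenTelemetry, which the Jaeger project now recommends for new instrumentation; the client examples in this article use the original OpenTracing-based libraries.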

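Architecture Components

Jaeger architecture:
  ├── Agent: receives trace data reported by applications and forwards it to the Collector
  ├── Collector: processes and stores trace data
  ├── Query: searches and retrieves trace data
  └── UI: visualizes trace data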

Installing Jaeger

One-Command Docker Deployment

# All-in-one mode (development environments)
docker run -d \
  --name jaeger \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:latest

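# Open the UI
# http://localhost:16686

Production Deployment

# Deploy with Docker Compose.
# The jaeger_v1 keyspace must already exist in Cassandra before the collector and query services start.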
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  cassandra:
    image: cassandra:4.0
    environment:
      - CASSANDRA_CLUSTER_NAME=jaeger
    volumes:
      - cassandra-data:/var/lib/cassandra

  jaeger-collector:
    image: jaegertracing/jaeger-collector:latest
    environment:
      - CASSANDRA_SERVERS=cassandra
      - CASSANDRA_KEYSPACE=jaeger_v1
    ports:
      - "14269:14269"
    depends_on:
      - cassandra

  jaeger-query:
    image: jaegertracing/jaeger-query:latest
    environment:
      - CASSANDRA_SERVERS=cassandra
      - CASSANDRA_KEYSPACE=jaeger_v1
    ports:
      - "16686:16686"
    depends_on:
      - cassandra

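  jaeger-agent:
    image: jaegertracing/jaeger-agent:latest
    # Forward received spans to the collector over gRPC (collector port 14250)
    command: ["--reporter.grpc.host-port=jaeger-collector:14250"]
    ports:
      - "6831:6831/udp"
      - "6832:6832/udp"
      - "5775:5775/udp"
    depends_on:
      - jaeger-collector

# Declare the named volume mounted by the cassandra service
volumes:
  cassandra-data:
EOF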

Application Integration

Go Application Integration

package main

import (
    "io"
    "log"

    "github.com/opentracing/opentracing-go"
    otlog "github.com/opentracing/opentracing-go/log"
    "github.com/uber/jaeger-client-go"
    "github.com/uber/jaeger-client-go/config"
)

// initTracer builds a tracer that samples every trace and reports spans to the agent over UDP.
func initTracer(serviceName string) (opentracing.Tracer, io.Closer, error) {
    cfg := config.Configuration{
        ServiceName: serviceName,
        Sampler: &config.SamplerConfig{
            Type:  "const",
            Param: 1,
        },
        Reporter: &config.ReporterConfig{
            LogSpans:           true,
            LocalAgentHostPort: "jaeger-agent:6831",
        },
    }

    return cfg.NewTracer(
        config.Logger(jaeger.NullLogger),
    )
}

func main() {
    tracer, closer, err := initTracer("my-service")
    if err != nil {
        log.Fatalf("failed to initialize tracer: %v", err)
    }
    // Closing the tracer flushes any buffered spans
    defer closer.Close()

    opentracing.SetGlobalTracer(tracer)

    // Create the root span
    span := opentracing.StartSpan("main-operation")
    defer span.Finish()

    // Add a tag and a structured log entry
    span.SetTag("user.id", "12345")
    span.LogFields(
        otlog.String("event", "request-start"),
    )

    // Create a child span
    childSpan := opentracing.StartSpan(
        "db-query",
        opentracing.ChildOf(span.Context()),
    )
    defer childSpan.Finish()

    // Run the database operation...
}

Python Application Integration

import opentracing
from jaeger_client import Config

def init_tracer(service_name):
    config = Config(
        config={
            'sampler': {
                'type': 'const',
                'param': 1,
            },
            'logging': True,
            'local_agent': {
                'reporting_host': 'jaeger-agent',
                'reporting_port': 6831,
            },
        },
        service_name=service_name,
    )
    return config.initialize_tracer()

tracer = init_tracer('my-python-service')
opentracing.set_global_tracer(tracer)

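# Trace a function automatically with a decorator
from opentracing_instrumentation import get_current_span
from opentracing_instrumentation.local_span import traced_function

@traced_function
def process_request(request):
    # Nest a child span under the span opened by the decorator
    with tracer.start_span('process-data', child_of=get_current_span()) as span:
        span.set_tag('request.id', request.id)
        result = handle(request)  # processing logic; handle is a placeholder
        return result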

Node.js Application Integration

const { initTracer } = require('jaeger-client');
const opentracing = require('opentracing');

const config = {
  serviceName: 'my-node-service',
  sampler: {
    type: 'const',
    param: 1,
  },
  reporter: {
    logSpans: true,
    agentHost: 'jaeger-agent',
    agentPort: 6831,
  },
};

const options = {
  logger: {
    info(msg) {
      console.log('INFO:', msg);
    },
    error(msg) {
      console.log('ERROR:', msg);
    },
  },
};

const tracer = initTracer(config, options);
opentracing.initGlobalTracer(tracer);

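// Express middleware
function tracingMiddleware(req, res, next) {
  // Continue the caller's trace when the request carries trace headers
  const parentContext = tracer.extract(opentracing.FORMAT_HTTP_HEADERS, req.headers);
  const span = tracer.startSpan('http_request', { childOf: parentContext || undefined });
  span.setTag('http.method', req.method);
  span.setTag('http.url', req.url);

  res.on('finish', () => {
    span.setTag('http.status_code', res.statusCode);
    if (res.statusCode >= 500) {
      span.setTag('error', true);
    }
    span.finish();
  });

  next();
}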
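The integrations above only produce a connected trace if each service passes its span context to the next one. Jaeger's native propagation format carries it in the uber-trace-id HTTP header as trace-id:span-id:parent-span-id:flags, with the IDs in hexadecimal. The following is a minimal sketch of that encoding, assuming 64-bit IDs; the SpanContext tuple and the encode, decode, and child_of helpers are hypothetical names and not part of any Jaeger client.

```python
from typing import NamedTuple


class SpanContext(NamedTuple):
    trace_id: int
    span_id: int
    parent_id: int  # 0 when the span has no parent
    flags: int      # bit 0 set means the trace is sampled


def encode(ctx: SpanContext) -> str:
    """Serialize a context into the uber-trace-id header value."""
    return f"{ctx.trace_id:x}:{ctx.span_id:x}:{ctx.parent_id:x}:{ctx.flags:x}"


def decode(header: str) -> SpanContext:
    """Parse an uber-trace-id header value back into a context."""
    parts = header.split(":")
    if len(parts) != 4:
        raise ValueError(f"malformed uber-trace-id: {header!r}")
    trace_id, span_id, parent_id, flags = (int(p, 16) for p in parts)
    return SpanContext(trace_id, span_id, parent_id, flags)


def child_of(parent: SpanContext, new_span_id: int) -> SpanContext:
    """Context for a downstream span: same trace, new span ID, caller as parent."""
    return SpanContext(parent.trace_id, new_span_id, parent.span_id, parent.flags)


incoming = decode("abc123:1:0:1")
outgoing = encode(child_of(incoming, 0x2))
print(outgoing)
```

In practice the client libraries handle this through their inject and extract calls; the sketch only shows what travels on the wire.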

Kubernetes Deployment

# jaeger-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jaeger
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
        - name: jaeger
          image: jaegertracing/all-in-one:latest
          ports:
            - containerPort: 16686
            - containerPort: 6831
              protocol: UDP
            - containerPort: 5775
              protocol: UDP
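          env:
            # all-in-one defaults to in-memory storage; select Cassandra explicitly
            - name: SPAN_STORAGE_TYPE
              value: "cassandra"
            - name: CASSANDRA_SERVERS
              value: "cassandra"
            - name: COLLECTOR_ZIPKIN_HOST_PORT
              value: ":9411"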

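Trace Analysis

Using the Query API

# Query traces for a service
curl "http://localhost:16686/api/traces?service=my-service&limit=100"

# Query by operation name (-G with --data-urlencode encodes the space and slash)
curl -G "http://localhost:16686/api/traces" \
  --data-urlencode "service=my-service" \
  --data-urlencode "operation=GET /api/users"

# Query by tag (the JSON value must be URL-encoded)
curl -G "http://localhost:16686/api/traces" \
  --data-urlencode "service=my-service" \
  --data-urlencode 'tags={"http.status_code":"500"}'

# Fetch the details of a specific trace
curl "http://localhost:16686/api/traces/<trace-id>"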
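When these queries are issued from a script, the operation name and the JSON tag filter need to be percent-encoded, which is easy to get wrong by hand. Below is a small sketch that builds the query URL with the standard library; build_traces_url is a hypothetical helper, and it assumes only the endpoint and the parameters that appear in the curl commands above.

```python
import json
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def build_traces_url(
    base: str,
    service: str,
    operation: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
    min_duration: Optional[str] = None,
    limit: int = 20,
) -> str:
    """Build a /api/traces query URL with every value percent-encoded."""
    params = {"service": service, "limit": str(limit)}
    if operation:
        params["operation"] = operation
    if tags:
        # The tag filter is passed as a compact JSON object
        params["tags"] = json.dumps(tags, separators=(",", ":"))
    if min_duration:
        params["minDuration"] = min_duration
    return f"{base}/api/traces?{urlencode(params, quote_via=quote)}"


print(build_traces_url("http://localhost:16686", "my-service",
                       operation="GET /api/users"))
print(build_traces_url("http://localhost:16686", "my-service",
                       tags={"http.status_code": "500"}))
```

The resulting string can be passed to any HTTP client, for example urllib.request.urlopen.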

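Analyzing Performance Bottlenecks

# Find slow requests
curl "http://localhost:16686/api/traces?service=my-service&minDuration=100ms"

# Find failed requests (the JSON tag filter must be URL-encoded)
curl -G "http://localhost:16686/api/traces" \
  --data-urlencode "service=my-service" \
  --data-urlencode 'tags={"error":"true"}'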
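Once a slow trace has been fetched, the span with the longest total duration is usually just the root, so a more useful figure is each span's self time: its own duration minus the time spent in its direct children. The sketch below computes that. It assumes the trace JSON has a spans list whose entries carry spanID, operationName, duration in microseconds, and references with refType and spanID; treat that shape, the self_times helper, and the sample data as assumptions to check against a real response.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def self_times(trace: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Return (operationName, self time in microseconds), slowest first."""
    spans = trace["spans"]
    child_total: Dict[str, int] = defaultdict(int)
    for span in spans:
        for ref in span.get("references", []):
            if ref.get("refType") == "CHILD_OF":
                # Charge this span's duration to its parent
                child_total[ref["spanID"]] += span["duration"]
    rows = [
        # Clamp at zero: children that run in parallel can sum past the parent
        (s["operationName"], max(s["duration"] - child_total[s["spanID"]], 0))
        for s in spans
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical trace in the assumed shape
sample = {
    "traceID": "abc123",
    "spans": [
        {"spanID": "a", "operationName": "GET /orders", "duration": 120000,
         "references": []},
        {"spanID": "b", "operationName": "list_orders", "duration": 95000,
         "references": [{"refType": "CHILD_OF", "spanID": "a"}]},
        {"spanID": "c", "operationName": "SELECT orders", "duration": 80000,
         "references": [{"refType": "CHILD_OF", "spanID": "b"}]},
    ],
}
for name, micros in self_times(sample):
    print(f"{name}: {micros / 1000:.1f} ms")
```

In this sample the database query accounts for most of the request time even though the root span has the largest total duration.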

Environment Variable Configuration

# Agent address and client settings
JAEGER_AGENT_HOST=jaeger-agent
JAEGER_AGENT_PORT=6831
JAEGER_SERVICE_NAME=my-service
JAEGER_SAMPLER_TYPE=const
JAEGER_SAMPLER_PARAM=1
JAEGER_REPORTER_LOG_SPANS=true
JAEGER_REPORTER_FLUSH_INTERVAL=1s
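The Jaeger client libraries read these variables themselves, so application code normally does not parse them. As an illustration of what the settings mean, here is a sketch that loads them into a plain dictionary; load_jaeger_env and the fallback values are hypothetical and do not reproduce any client's actual defaults.

```python
import os
from typing import Any, Dict, Mapping


def load_jaeger_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect the JAEGER_* settings, using illustrative fallbacks."""
    return {
        "service_name": env.get("JAEGER_SERVICE_NAME", "unknown-service"),
        "agent_host": env.get("JAEGER_AGENT_HOST", "localhost"),
        "agent_port": int(env.get("JAEGER_AGENT_PORT", "6831")),
        "sampler_type": env.get("JAEGER_SAMPLER_TYPE", "const"),
        "sampler_param": float(env.get("JAEGER_SAMPLER_PARAM", "1")),
        "log_spans": env.get("JAEGER_REPORTER_LOG_SPANS", "false").lower() == "true",
    }


settings = load_jaeger_env({
    "JAEGER_AGENT_HOST": "jaeger-agent",
    "JAEGER_SERVICE_NAME": "my-service",
    "JAEGER_SAMPLER_TYPE": "probabilistic",
    "JAEGER_SAMPLER_PARAM": "0.1",
})
print(settings)
```

Passing a mapping explicitly, as in the example, keeps the function easy to test without touching the real process environment.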

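Best Practices

  1. Sampling strategy: use probabilistic sampling in production and constant sampling (record every trace) in development.
  2. Span naming: use meaningful operation names and avoid high cardinality, for example GET /api/users/{id} rather than a separate name for every user ID.
  3. Tags: attach metadata that helps you filter and diagnose traces.
  4. Error handling: record error details and stack traces on the span.
  5. Performance: report trace data asynchronously so that it never blocks the main request path.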
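Two of the practices above are easy to express in code. The sketch below shows a probabilistic sampling decision derived from the trace ID, so that every service reaches the same verdict for a given trace, and an operation-name helper that replaces ID-like path segments with a placeholder to keep cardinality low. It is loosely modeled on how probabilistic samplers generally work rather than on Jaeger's exact implementation, and should_sample and operation_name are hypothetical names.

```python
import re

_MASK_63 = (1 << 63) - 1
# Numeric IDs or UUID-shaped segments, e.g. /12345 or /550e8400-e29b-41d4-a716-446655440000
_ID_SEGMENT = re.compile(r"/(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F-]{27})(?=/|$)")


def should_sample(trace_id: int, rate: float) -> bool:
    """Sample when the low 63 bits of the trace ID fall below rate * 2**63."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    boundary = int(rate * (1 << 63))
    return (trace_id & _MASK_63) < boundary


def operation_name(method: str, path: str) -> str:
    """Build a low-cardinality span name such as 'GET /api/users/{id}'."""
    normalized = _ID_SEGMENT.sub("/{id}", path)
    return f"{method.upper()} {normalized}"


print(should_sample(12345, 1.0))
print(should_sample(12345, 0.0))
print(operation_name("get", "/api/users/12345/orders/7"))
```

Because the decision depends only on the trace ID and the rate, a trace is either recorded in full or dropped in full, which avoids partially sampled traces.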