Distributed Tracing

Configure distributed tracing to track requests across microservices.

Overview

Traces show request flow through your system:

  • Spans: Individual operations (DB query, HTTP call)
  • Parent-child relationships: Call hierarchy
  • Timing: Duration of each operation
  • Attributes: Metadata (user_id, endpoint, status)
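
To make these four ideas concrete, here is a minimal sketch of how a trace could be modeled as plain data. The `Span` class, `render` helper, and sample values are hypothetical and for illustration only; they are not the OpenTelemetry SDK types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span record (not the OpenTelemetry SDK type)."""
    name: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    start_ms: float
    end_ms: float
    attributes: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        # Timing: how long this operation took
        return self.end_ms - self.start_ms


def render(spans, parent_id=None, depth=0):
    """Return the call hierarchy as indented lines, ordered by start time."""
    lines = []
    children = [s for s in spans if s.parent_id == parent_id]
    for span in sorted(children, key=lambda s: s.start_ms):
        lines.append(f"{'  ' * depth}{span.name} ({span.duration_ms:.0f} ms)")
        lines.extend(render(spans, span.span_id, depth + 1))
    return lines


# One trace: a root HTTP span with two child operations (made-up values)
trace_spans = [
    Span("GET /api/users/:id", "a1", None, 0.0, 42.0, {"status": 200}),
    Span("db.query", "b2", "a1", 5.0, 30.0, {"user_id": "12345"}),
    Span("http.call", "c3", "a1", 31.0, 40.0),
]
print("\n".join(render(trace_spans)))
```

The trace timeline view is essentially this tree drawn against a time axis.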

Basic Setup

Node.js

// SpanStatusCode is needed for the error status set in the catch block
const { trace, SpanStatusCode } = require('@opentelemetry/api');
const tracer = trace.getTracer('my-service');

app.get('/api/users/:id', async (req, res) => {
  const span = tracer.startSpan('get_user');

  try {
    const user = await db.query('SELECT * FROM users WHERE id = ?', [req.params.id]);
    span.setAttribute('user.id', user.id);
    span.setAttribute('user.plan', user.plan);
    res.json(user);
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw error;
  } finally {
    span.end();
  }
});

Python

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

@app.route('/api/users/<user_id>')
def get_user(user_id):
    with tracer.start_as_current_span("get_user") as span:
        span.set_attribute("user.id", user_id)
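        # Use a parameterized query; never interpolate user input into SQL.
        # The placeholder style (%s, ?, :name) depends on your database driver.
        user = db.query("SELECT * FROM users WHERE id = %s", (user_id,))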
        span.set_attribute("user.plan", user['plan'])
        return jsonify(user)

Go

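import (
    "encoding/json"
    "net/http"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
)

func getUser(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    tracer := otel.Tracer("my-service")

    ctx, span := tracer.Start(ctx, "get_user")
    defer span.End()

    userID := r.URL.Query().Get("id")
    span.SetAttributes(attribute.String("user.id", userID))

    // Assumes db is a package-level *sql.DB and User is a struct with ID and Plan fields.
    var user User
    err := db.QueryRowContext(ctx, "SELECT id, plan FROM users WHERE id = ?", userID).
        Scan(&user.ID, &user.Plan)
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
        http.Error(w, "failed to load user", http.StatusInternalServerError)
        return
    }

    json.NewEncoder(w).Encode(user)
}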

Span Attributes

Add context to spans:

span.setAttribute('http.method', 'POST');
span.setAttribute('http.url', '/api/orders');
span.setAttribute('http.status_code', 200);
span.setAttribute('user.id', '12345');
span.setAttribute('order.id', 'ORD-789');
span.setAttribute('order.total', 49.99);

Nested Spans

Track sub-operations:

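const { trace, context } = require('@opentelemetry/api');

const parentSpan = tracer.startSpan('process_order');
// A child span takes its parent from the context passed as the third argument
const parentContext = trace.setSpan(context.active(), parentSpan);

// Child span 1
const validateSpan = tracer.startSpan('validate_order', undefined, parentContext);
validateOrder();
validateSpan.end();

// Child span 2
const chargeSpan = tracer.startSpan('charge_payment', undefined, parentContext);
await chargePayment();
chargeSpan.end();

parentSpan.end();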

Context Propagation

Trace across services:

Service A (Caller)

const { propagation, context } = require('@opentelemetry/api');

const headers = {};
propagation.inject(context.active(), headers);

fetch('http://service-b/api/endpoint', { headers });

Service B (Receiver)

const extractedContext = propagation.extract(context.active(), req.headers);

context.with(extractedContext, () => {
  // With a context manager registered, this span becomes a child of Service A's span
  const span = tracer.startSpan('handle_request');
  // ... handle the request ...
  span.end();
});
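
What `inject` and `extract` carry between services is an HTTP header. Assuming the W3C Trace Context propagator (the usual OpenTelemetry default), that header is `traceparent`, laid out as `version-trace_id-parent_span_id-flags`. The sketch below builds and parses that header by hand to show what travels over the wire. The function names are hypothetical, it handles only the version `00` layout, and in real code you would use the SDK's propagation API.

```python
import re
import secrets

# Version 00 layout: 2 hex version, 32 hex trace ID, 16 hex span ID, 2 hex flags
TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def format_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Build the header value Service A would send."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str):
    """Return the caller's trace context as a dict, or None if the header is invalid."""
    match = TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the W3C specification
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


# Service A: attach the header to the outgoing request
outgoing = {"traceparent": format_traceparent(secrets.token_hex(16), secrets.token_hex(8))}

# Service B: recover the caller's context; new spans reuse trace_id and
# record parent_span_id as their parent
incoming = parse_traceparent(outgoing["traceparent"])
print(incoming)
```

Because both services share the same trace ID, their spans are stitched into a single trace.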

Sampling

Control which traces to collect:

const { BasicTracerProvider, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');

const provider = new BasicTracerProvider({
  sampler: new TraceIdRatioBasedSampler(0.1)  // Sample 10% of traces
});
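
A ratio-based sampler decides from the trace ID itself rather than from a random draw per span, so every service that sees the same trace ID reaches the same decision. The sketch below illustrates that idea; `should_sample` is a hypothetical helper, and the exact comparison used by each OpenTelemetry SDK may differ.

```python
import random


def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below ratio * 2**64."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = int(ratio * (1 << 64))
    low_64_bits = int(trace_id_hex, 16) & ((1 << 64) - 1)
    return low_64_bits < bound


# Simulate 10,000 random 128-bit trace IDs (seeded so the run is repeatable)
rng = random.Random(42)
ids = [f"{rng.getrandbits(128):032x}" for _ in range(10_000)]
kept = sum(should_sample(trace_id, 0.1) for trace_id in ids)
print(f"kept {kept} of {len(ids)} traces")
```

Roughly one trace in ten is kept, and the decision for any given trace ID never changes between calls or services.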

Viewing Traces

Dashboard → Observability → Apps → [Your App] → Traces

Features:

  • Trace timeline visualization
  • Service dependency graph
  • Highlighting of error traces
  • Filtering for slow traces
  • Search by span attributes

Best Practices

✅ DO:

  • Use semantic conventions (http.*, db.*, etc.)
  • Add business context (user_id, order_id)
  • Sample high-traffic endpoints
  • Propagate context across services

❌ DON'T:

  • Include sensitive data in attributes
  • Create too many spans (>100 per trace)
  • Forget to end spans
  • Block on span export
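
One way to keep sensitive data out of attributes is to scrub them in a single helper before they reach a span. This is a minimal sketch: the `SENSITIVE_KEYS` list and the `scrub_attributes` function are hypothetical and should be adapted to your own data.

```python
# Hypothetical deny-list; extend it to match the fields your services handle
SENSITIVE_KEYS = {"password", "token", "authorization", "credit_card", "ssn"}


def scrub_attributes(attributes: dict) -> dict:
    """Return a copy with values redacted where the last key segment is sensitive."""
    clean = {}
    for key, value in attributes.items():
        last_segment = key.lower().rsplit(".", 1)[-1]
        clean[key] = "[REDACTED]" if last_segment in SENSITIVE_KEYS else value
    return clean


safe = scrub_attributes({
    "user.id": "12345",
    "order.id": "ORD-789",
    "user.password": "hunter2",
})
print(safe)
```

Each entry in the returned dictionary can then be passed to `span.set_attribute` as usual.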

Next Steps

  • Logs - Correlate logs with traces
  • Metrics - Track trace metrics
  • SDKs - Language-specific guides