Distributed Tracing
Configure distributed tracing to track requests across microservices.
Overview
Traces show request flow through your system:
- Spans: Individual operations (DB query, HTTP call)
- Parent-child relationships: Call hierarchy
- Timing: Duration of each operation
- Attributes: Metadata (user_id, endpoint, status)
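To make these four ideas concrete, here is a minimal sketch that models a trace as plain objects and prints the call hierarchy with durations. It is an illustration only: the field names (`parentId`, `startMs`, `endMs`), the sample values, and the `renderTrace` helper are assumptions made for this example, not the SDK's span representation.

```javascript
// Illustrative data model only; real SDK spans carry more fields.
const spans = [
  { id: 'a', parentId: null, name: 'GET /api/users/:id', startMs: 0, endMs: 42,
    attributes: { 'http.status_code': 200 } },
  { id: 'b', parentId: 'a', name: 'db.query', startMs: 5, endMs: 30,
    attributes: { 'user.id': '12345' } },
  { id: 'c', parentId: 'a', name: 'render_json', startMs: 31, endMs: 40,
    attributes: {} },
];

// Walk the parent-child links and return one indented line per span.
function renderTrace(allSpans, parentId = null, depth = 0) {
  return allSpans
    .filter((s) => s.parentId === parentId)
    .sort((x, y) => x.startMs - y.startMs)
    .flatMap((s) => [
      `${'  '.repeat(depth)}${s.name} (${s.endMs - s.startMs} ms)`,
      ...renderTrace(allSpans, s.id, depth + 1),
    ]);
}

console.log(renderTrace(spans).join('\n'));
```

The root span covers the whole request, and each child's duration shows where that time went.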
Basic Setup
Node.js
// SpanStatusCode is needed for the error branch below
const { trace, SpanStatusCode } = require('@opentelemetry/api');
const tracer = trace.getTracer('my-service');

app.get('/api/users/:id', async (req, res) => {
  const span = tracer.startSpan('get_user');
  try {
    const user = await db.query('SELECT * FROM users WHERE id = ?', [req.params.id]);
    span.setAttribute('user.id', user.id);
    span.setAttribute('user.plan', user.plan);
    res.json(user);
  } catch (error) {
    // Attach the exception to the span and mark the span as failed
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
    throw error;
  } finally {
    // Always end the span, even on error, so it gets exported
    span.end();
  }
});
Python
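from flask import jsonify
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

# Assumes an existing Flask `app` and a `db` helper that accepts bound parameters
@app.route('/api/users/<user_id>')
def get_user(user_id):
    with tracer.start_as_current_span("get_user") as span:
        span.set_attribute("user.id", user_id)
        # Bind user_id as a parameter; never interpolate request input into SQL
        user = db.query("SELECT * FROM users WHERE id = ?", (user_id,))
        span.set_attribute("user.plan", user['plan'])
        return jsonify(user)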
Go
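import (
	"encoding/json"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
)

func getUser(w http.ResponseWriter, r *http.Request) {
	ctx, span := otel.Tracer("my-service").Start(r.Context(), "get_user")
	defer span.End()

	userID := r.URL.Query().Get("id")
	span.SetAttributes(attribute.String("user.id", userID))

	// db is assumed to be a *sql.DB initialized elsewhere
	var user struct {
		ID   string `json:"id"`
		Plan string `json:"plan"`
	}
	err := db.QueryRowContext(ctx, "SELECT id, plan FROM users WHERE id = ?", userID).Scan(&user.ID, &user.Plan)
	if err != nil {
		span.RecordError(err)
		span.SetStatus(codes.Error, err.Error())
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	json.NewEncoder(w).Encode(user)
}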
Span Attributes
Add context to spans:
span.setAttribute('http.method', 'POST');
span.setAttribute('http.url', '/api/orders');
span.setAttribute('http.status_code', 200);
span.setAttribute('user.id', '12345');
span.setAttribute('order.id', 'ORD-789');
span.setAttribute('order.total', 49.99);
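Span attribute values are limited to strings, numbers, booleans, and arrays of those, so nested application objects need flattening before they can be attached. The sketch below shows one way to do that; `toSpanAttributes` is a hypothetical helper written for this example, not part of the OpenTelemetry API.

```javascript
// Hypothetical helper: flatten a nested object into dotted attribute keys.
function toSpanAttributes(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}.${key}` : key;
    if (value === null || value === undefined) continue; // skip empty values
    if (Array.isArray(value)) {
      out[name] = value; // caller must keep arrays to a single primitive type
    } else if (typeof value === 'object') {
      Object.assign(out, toSpanAttributes(value, name)); // recurse into nested objects
    } else {
      out[name] = value;
    }
  }
  return out;
}

const attrs = toSpanAttributes({
  user: { id: '12345' },
  order: { id: 'ORD-789', total: 49.99, coupon: null },
});
console.log(attrs);
```

With a real span, the result can be attached in one call with `span.setAttributes(attrs)`.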
Nested Spans
Track sub-operations:
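const { trace, context } = require('@opentelemetry/api');

const parentSpan = tracer.startSpan('process_order');
// A child span takes its parent from the context passed as the third argument
const parentContext = trace.setSpan(context.active(), parentSpan);

// Child span 1
const validateSpan = tracer.startSpan('validate_order', undefined, parentContext);
validateOrder();
validateSpan.end();

// Child span 2
const chargeSpan = tracer.startSpan('charge_payment', undefined, parentContext);
await chargePayment();
chargeSpan.end();

parentSpan.end();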
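Ending every span by hand is easy to get wrong once sub-operations can throw. One possible pattern is a small wrapper around `tracer.startActiveSpan`, which makes the new span the active one so that spans started inside the callback nest under it. In the sketch below, `withSpan` is a hypothetical helper, and `fakeTracer` is a stand-in that only records events so the example runs without the SDK installed. In a real service you would pass the tracer from `trace.getTracer(...)`.

```javascript
const SPAN_STATUS_ERROR = 2; // numeric value of SpanStatusCode.ERROR in @opentelemetry/api

// Hypothetical helper: run fn inside an active span and always end the span.
async function withSpan(tracer, name, fn) {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await fn(span);
    } catch (error) {
      span.recordException(error);
      span.setStatus({ code: SPAN_STATUS_ERROR, message: error.message });
      throw error;
    } finally {
      span.end(); // runs on success and on failure
    }
  });
}

// Stand-in tracer (assumption for this sketch): records span lifecycle events.
const events = [];
const fakeTracer = {
  startActiveSpan(name, fn) {
    events.push(`start ${name}`);
    return fn({
      recordException: (e) => events.push(`exception ${name}: ${e.message}`),
      setStatus: (s) => events.push(`status ${name}: ${s.code}`),
      end: () => events.push(`end ${name}`),
    });
  },
};

async function processOrder() {
  return withSpan(fakeTracer, 'process_order', async () => {
    await withSpan(fakeTracer, 'validate_order', async () => {});
    await withSpan(fakeTracer, 'charge_payment', async () => {});
    return 'ok';
  });
}
```

Calling `processOrder()` records both child spans starting and ending inside `process_order`, and the parent ends last.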
Context Propagation
Trace across services:
Service A (Caller)
const { propagation, context } = require('@opentelemetry/api');
const headers = {};
propagation.inject(context.active(), headers);
fetch('http://service-b/api/endpoint', { headers });
Service B (Receiver)
const extractedContext = propagation.extract(context.active(), req.headers);
context.with(extractedContext, () => {
  // startSpan reads the active context, so this span becomes a child of Service A's span
  const span = tracer.startSpan('handle_request');
  // ... handle the request ...
  span.end();
});
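It can help to know what actually travels between the services. Assuming the W3C Trace Context propagator is configured (a common default), the injected header is `traceparent`, in the form `version-traceid-parentid-flags`. The sketch below parses and formats only the version `00` layout; `parseTraceparent` and `formatTraceparent` are hypothetical helpers for illustration, and in practice `propagation.inject` and `propagation.extract` do this for you.

```javascript
// W3C Trace Context, version 00: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim());
  if (!match) return null;
  const [, traceId, parentSpanId, flags] = match;
  // All-zero IDs are invalid per the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

function formatTraceparent({ traceId, spanId, sampled }) {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}

const incoming = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
console.log(incoming);
```

The receiver keeps the same `traceId` and uses `parentSpanId` as the parent of its own span, which is what links the two services into one trace.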
Sampling
Control which traces to collect:
const { BasicTracerProvider, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');
const provider = new BasicTracerProvider({
  sampler: new TraceIdRatioBasedSampler(0.1) // Sample 10% of traces
});
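Ratio-based sampling is keyed on the trace ID rather than on a random draw per span, so every service that sees the same trace ID reaches the same keep-or-drop decision. The sketch below illustrates that idea in simplified form. `shouldSample` is a hypothetical function, and the SDK's `TraceIdRatioBasedSampler` may derive its value from the trace ID differently.

```javascript
// Simplified illustration of a deterministic, trace-ID-based sampling decision.
function shouldSample(traceId, ratio) {
  // Interpret the last 8 hex characters as an unsigned 32-bit integer.
  const value = parseInt(traceId.slice(-8), 16);
  // Keep the trace when the value falls in the lowest `ratio` share of the range.
  return value < ratio * 2 ** 32;
}

const traceId = '4bf92f3577b34da6a3ce929d0e0e4736';
console.log(shouldSample(traceId, 1.0)); // ratio 1 keeps every trace
console.log(shouldSample(traceId, 0.0)); // ratio 0 drops every trace
```

Because the decision depends only on the trace ID and the ratio, repeating the call in another service gives the same answer, which avoids traces with missing spans.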
Viewing Traces
Dashboard → Observability → Apps → [Your App] → Traces
Features:
- Trace timeline visualization
- Service dependency graph
- Error traces highlighted
- Slow traces filtered
- Search by span attributes
Best Practices
✅ DO:
- Use semantic conventions (http.*, db.*, etc.)
- Add business context (user_id, order_id)
- Sample high-traffic endpoints
- Propagate context across services
❌ DON'T:
- Include sensitive data in attributes
- Create too many spans (>100 per trace)
- Forget to end spans
- Block on span export
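One way to enforce the rule about sensitive data is to filter attributes before they reach a span. The sketch below masks values whose key looks sensitive. The `redactAttributes` helper and the key pattern are assumptions for illustration, and a real deployment would need a list tuned to its own data.

```javascript
// Hypothetical key pattern; extend it to match your own sensitive fields.
const SENSITIVE_KEY = /(password|secret|token|authorization|card|ssn)/i;

// Return a copy of the attributes with sensitive values masked.
function redactAttributes(attributes) {
  return Object.fromEntries(
    Object.entries(attributes).map(([key, value]) =>
      SENSITIVE_KEY.test(key) ? [key, '[REDACTED]'] : [key, value]
    )
  );
}

const safe = redactAttributes({
  'user.id': '12345',
  'user.password': 'hunter2',
  'http.request.header.authorization': 'Bearer abc',
});
console.log(safe);
```

With a real span, pass the filtered result to `span.setAttributes(safe)` instead of setting raw request fields directly.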