hard | DevOps Engineer | Technology
What is observability, and how do logging, metrics, and tracing work together in a production system?
Posted 18/04/2026
by Mehedy Hasan Ador
Question Details
At a microservices company:
> "Users report intermittent slowness. Our logs show no errors. How do we find the bottleneck across 15 microservices?"
Suggested Solution
The Three Pillars of Observability
1. Logging (What happened?)
// Structured logging
logger.info({
  event: "order_created",
  orderId: "123",
  userId: "456",
  duration: 230,
  service: "order-service",
});
// Search: "Show me all order_created events for user 456"
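To make the search comment above concrete, here is a minimal sketch of why structured logs are easy to query. It assumes each log entry is written as one JSON line; `toLogLine` and `findLogs` are hypothetical helpers for illustration, not part of any logging library, and a real setup would send the lines to a log store and query them there.

```typescript
// Hypothetical field type: flat key/value pairs, as in the example above.
type LogFields = Record<string, string | number | boolean>;

// Serialize one event as a single JSON line (assumed output format).
function toLogLine(event: string, fields: LogFields, now: Date = new Date()): string {
  return JSON.stringify({ timestamp: now.toISOString(), level: "info", event, ...fields });
}

// Return every parsed record whose fields match all of the wanted key/value pairs.
function findLogs(lines: string[], wanted: LogFields): LogFields[] {
  return lines
    .map((line) => JSON.parse(line) as LogFields)
    .filter((rec) => Object.entries(wanted).every(([key, value]) => rec[key] === value));
}

// Usage: "Show me all order_created events for user 456"
const sampleLines = [
  toLogLine("order_created", { orderId: "123", userId: "456", duration: 230, service: "order-service" }),
  toLogLine("order_created", { orderId: "124", userId: "789", duration: 95, service: "order-service" }),
];
const sampleHits = findLogs(sampleLines, { event: "order_created", userId: "456" });
console.log(sampleHits.map((hit) => hit.orderId));
```

Because every field is a key rather than free text, the same filter works for any combination of fields without regular expressions.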
2. Metrics (How much/how many?)
Prometheus metrics
http_requests_total{method="GET", path="/api/orders", status="200"} 15423
http_request_duration_seconds{quantile="0.99"} 0.234
Alert: "p99 latency > 500ms for 5 minutes"
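The alert rule above can be sketched in plain code to show what it checks. This is an illustration only: it assumes raw latency samples grouped into one-minute windows and uses a nearest-rank percentile, which is not how Prometheus itself derives quantiles. `percentile` and `shouldAlert` are hypothetical names.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1]!;
}

// Fire only if p99 exceeded the threshold in each of the last `forWindows`
// windows (assumed here to be one-minute windows of latencies in ms).
function shouldAlert(windows: number[][], thresholdMs = 500, forWindows = 5): boolean {
  if (windows.length < forWindows) return false;
  return windows
    .slice(-forWindows)
    .every((w) => w.length > 0 && percentile(w, 99) > thresholdMs);
}
```

Requiring the condition to hold for several consecutive windows is what keeps a single slow request from paging anyone, which matters for the "intermittent slowness" scenario in the question.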
3. Tracing (Where did time go?)
Request → [API Gateway: 2ms]
        → [Auth Service: 15ms] ← SLOW!
        → [Order Service: 5ms]
        → [Database: 180ms] ← BOTTLENECK!
Total: 202ms
Trace ID abc-123 links the spans from all services into a single trace.
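Once a trace like the one above has been collected, finding the bottleneck is a matter of comparing span durations. The sketch below assumes the spans run one after another and do not overlap, as in the diagram; `SpanTiming` and `findBottleneck` are hypothetical names, and a tracing UI such as Jaeger would normally show this as a waterfall view instead.

```typescript
// Hypothetical, simplified span record: just a name and a duration.
interface SpanTiming {
  name: string;
  durationMs: number;
}

// Return the slowest span and its share of the total time
// (assumes sequential, non-overlapping spans).
function findBottleneck(spans: SpanTiming[]): { span: SpanTiming; share: number } {
  if (spans.length === 0) throw new Error("trace has no spans");
  const total = spans.reduce((sum, s) => sum + s.durationMs, 0);
  const slowest = spans.reduce((a, b) => (b.durationMs > a.durationMs ? b : a));
  return { span: slowest, share: total > 0 ? slowest.durationMs / total : 0 };
}

// The trace from the diagram above.
const exampleTrace: SpanTiming[] = [
  { name: "API Gateway", durationMs: 2 },
  { name: "Auth Service", durationMs: 15 },
  { name: "Order Service", durationMs: 5 },
  { name: "Database", durationMs: 180 },
];
const worst = findBottleneck(exampleTrace);
console.log(`${worst.span.name}: ${Math.round(worst.share * 100)}% of total time`);
```

This is also why logs with no errors are not enough here: nothing failed, but one span accounts for most of the request time.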
Distributed Tracing Setup
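import { trace, SpanStatusCode } from "@opentelemetry/api";
const tracer = trace.getTracer("order-service");
// Runs fn inside an active span, records failures, and always ends the span.
const inSpan = (name, fn) =>
  tracer.startActiveSpan(name, async (span) => {
    try {
      return await fn(span);
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
// verifyAuth and prisma are assumed to be defined elsewhere in the service.
async function createOrder(data) {
  // The outer span is active, so the inner spans become its children.
  return inSpan("createOrder", async (span) => {
    span.setAttribute("userId", data.userId);
    await inSpan("verifyAuth", () => verifyAuth(data.token));
    return inSpan("db.insert", () => prisma.order.create({ data }));
  });
}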
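The setup above creates spans inside one service. For the trace ID to link all 15 services, each outgoing request has to carry the trace context, commonly in a W3C `traceparent` HTTP header. OpenTelemetry's propagators normally handle this for you; the sketch below only illustrates the header format, handles version `00` only, and uses hypothetical helper names (`formatTraceparent`, `parseTraceparent`).

```typescript
// Simplified trace context: the fields carried by a traceparent header.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters (the caller's span)
  sampled: boolean;
}

// version "00" - trace-id - parent-id - trace-flags
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Build the header value the caller attaches to an outgoing request.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Parse the header in the receiving service; returns null if it is invalid.
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header.trim());
  if (!match) return null;
  const traceId = match[1]!;
  const spanId = match[2]!;
  const flags = match[3]!;
  // All-zero IDs are defined as invalid by the Trace Context spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Usage: the receiving service continues the same trace under a new span.
const incoming = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
console.log(incoming?.traceId);
```

The receiving service keeps the trace ID and uses the incoming span ID as the parent of its own spans, which is what lets a tracing backend assemble one end-to-end trace per request.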