> Every performance problem starts with a queue backing up somewhere. Maybe it's a socket's listen queue. Maybe it's the OS's run queue or the database's I/O queue.
> — Michael T. Nygard, *Release It!* 2nd ed., p. 120
Nygard's diagnostic heuristic: behind every slowdown is a queue filling faster than it drains. It reframes "performance tuning" as **find the backed-up queue** — listen queue, run queue, I/O queue, thread pool, connection pool — rather than chasing symptoms.
Two consequences make it an *outage* story, not just a latency one:
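- **Unbounded queues are fatal.** An unbounded queue can grow until it consumes all available memory, and by **Little's law** (L = λW: average queue length equals throughput times average time in the system), response time at a fixed throughput grows in proportion to queue length. As queue length heads toward infinity, so does response time. Queues must be *finite* for response times to be finite.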
- **Slow is indistinguishable from down.** Once response times stretch past callers' timeouts, "to an outside observer, there's no difference between 'really, really slow' and 'down.'" (*Release It!* 2nd ed., p. 119, "Shed Load")
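
A rough worked illustration of both bullets, using made-up numbers that are not from the book: assume a service that completes λ = 50 requests per second and has L = 500 requests waiting. Little's law relates steady-state averages, so this is an estimate rather than a guarantee for any single request.

```latex
\[
L = \lambda W
\quad\Longrightarrow\quad
W = \frac{L}{\lambda}
  = \frac{500\ \text{requests}}{50\ \text{requests/s}}
  = 10\ \text{s}
\]
```

If callers time out after an assumed 5 s, the typical request is abandoned before it is served, so the service looks down even though it is still doing work. Under the same assumptions, capping the queue at 100 entries keeps the average wait near 100 / 50 = 2 s.
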
The fix is to bound the queue and decide what happens when it's full — drop, refuse, or **block the producer** to apply [[Create Back Pressure to Slow Producers|back pressure]] upstream. The quote is the entry point: when something is slow, *locate the queue first*.
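
The three full-queue policies can be sketched with Python's standard `queue.Queue`, which is bounded when given a `maxsize`. This is a minimal illustration, not code from the book: the `submit` helper, the policy names, and `QueueFullError` are hypothetical names introduced here.

```python
import queue


class QueueFullError(Exception):
    """Hypothetical error used to tell a producer its work was refused."""


def submit(q, item, policy):
    """Offer `item` to the bounded queue `q` under one full-queue policy.

    Returns True if the item was enqueued, False if it was dropped.
    Raises QueueFullError when the policy is "refuse" and the queue is full.
    """
    if policy not in ("drop", "refuse", "block"):
        raise ValueError(f"unknown policy: {policy!r}")

    if policy == "block":
        # Back pressure: the producer waits here until a consumer frees a slot.
        q.put(item)
        return True

    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        if policy == "drop":
            # Discard the item; the producer only sees a False return value.
            return False
        # "refuse": fail fast so the producer can retry later or give up.
        raise QueueFullError("queue is full, try again later")


# A finite queue: at most 2 items can wait at any time.
work = queue.Queue(maxsize=2)
```

With `work` full, `submit(work, x, "drop")` returns `False`, `submit(work, x, "refuse")` raises `QueueFullError`, and `submit(work, x, "block")` does not return until some consumer calls `work.get()`. Which policy fits depends on whether the producer can tolerate lost work, can handle an error, or can afford to wait.
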
---
*Source: [[Release It Second Edition]] (Michael T. Nygard, Pragmatic Bookshelf 2018) — Ch 5 — Stability Patterns*