Every Performance Problem Starts With a Queue - Nestor G Pestelos Jr (ngpestelos)

> Every performance problem starts with a queue backing up somewhere. Maybe it's a socket's listen queue. Maybe it's the OS's run queue or the database's I/O queue.
>
> Michael T. Nygard, *Release It!* 2nd ed., p. 120

Nygard's diagnostic heuristic: behind every slowdown is a queue filling faster than it drains. It reframes "performance tuning" as **find the backed-up queue** (listen queue, run queue, I/O queue, thread pool, connection pool) rather than chasing symptoms.

Two consequences make it an *outage* story, not just a latency one:

- **Unbounded queues are fatal.** An unbounded queue consumes all available memory, and by **Little's law**, as queue length heads toward infinity, so does response time. Queues must be *finite* for response times to be finite.
- **Slow is indistinguishable from down.** Once response times stretch past callers' timeouts, "to an outside observer, there's no difference between 'really, really slow' and 'down.'" (*Release It!*, p. 119, Shed Load.)

The fix is to bound the queue and decide what happens when it's full: drop, refuse, or **block the producer** to apply [[Create Back Pressure to Slow Producers|back pressure]] upstream.

The quote is the entry point: when something is slow, *locate the queue first*.

---

*Source: [[Release It Second Edition]] (Michael T. Nygard, Pragmatic Bookshelf 2018), Ch 5: Stability Patterns*
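
The three full-queue policies named above (drop, refuse, block the producer) can be sketched with Python's standard `queue.Queue`. This is a minimal illustration, not code from *Release It!*: the names `submit`, `Overloaded`, and `mean_wait_seconds` are hypothetical, and the Little's law helper assumes a steady state in which arrival rate equals throughput.

```python
import queue
from typing import Any, Optional


class Overloaded(Exception):
    """Hypothetical error raised to the caller under the 'refuse' policy."""


def submit(q: queue.Queue, item: Any, policy: str = "block",
           timeout: Optional[float] = None) -> bool:
    """Offer an item to a bounded queue; return True if it was enqueued.

    policy="block"  : wait for space (back pressure on the producer);
                      raises queue.Full if `timeout` seconds elapse first.
    policy="drop"   : discard the item silently and return False.
    policy="refuse" : raise Overloaded so the caller learns immediately.
    """
    if policy not in ("block", "drop", "refuse"):
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "block":
        q.put(item, block=True, timeout=timeout)
        return True
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        if policy == "drop":
            return False
        raise Overloaded("queue is full") from None


def mean_wait_seconds(items_in_system: float, throughput_per_s: float) -> float:
    """Little's law, L = lambda * W, rearranged to W = L / lambda."""
    return items_in_system / throughput_per_s


# The bound is what keeps both memory use and waiting time finite.
work = queue.Queue(maxsize=2)
```

With a fixed throughput, `mean_wait_seconds` grows in proportion to the backlog, which is why an unbounded queue implies unbounded response time. Blocking only slows producers that share a process or thread pool with the queue; across a network boundary, refusing quickly is usually the closer analogue.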