Create Back Pressure to Slow Producers - Nestor G Pestelos Jr (ngpestelos)

"Every performance problem starts with a queue backing up somewhere": a socket listen queue, the OS run queue, a database I/O queue. An **unbounded** queue can grow until it consumes all available memory, and by Little's law, as its length heads toward infinity, so does response time. Queues must therefore be **finite** for response times to be finite.

A bounded queue that is full forces a choice when a producer pushes one more item. The only options are to pretend to accept the item but drop it, accept it and drop something older, refuse it, or **block the producer**. Blocking is flow control: it applies **back pressure** that propagates upstream, throttling the ultimate client until the queue drains. TCP does exactly this with its window: once the window is full, the sender can't send, its transmit buffers fill, and `write()` blocks.

Nygard's example: an API server allows 100 simultaneous calls to its storage engine. The 101st call's thread blocks until a slot frees. That blocking *is* the back pressure, so the server can't outrun the engine.

Constraints: back pressure works **within a system boundary** and only when the consumer pool is **finite** (a diverse Internet "upstream" has no systemic throttle). Because it inevitably blocks threads ("a quick path to downtime"), at the edges you need load shedding and asynchronous calls instead: accept requests on one thread pool, call out on another, and on timeout either return a 503 or queue the work and return a 202. Distinguish temporary back pressure from a genuinely broken consumer, and make sure monitoring is alerted when back pressure kicks in. "The only alternative is to let them crash the provider."

---

*Source: [[Release It Second Edition]] (Michael T. Nygard, Pragmatic Bookshelf 2018), Ch 5: Stability Patterns*
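
As a rough illustration of the API-server example above, the sketch below caps concurrent calls with a counting semaphore. It is only one way to express the idea, not code from the book: `BoundedCaller`, `call_blocking`, `call_or_shed`, `slow_engine`, and `demo` are hypothetical names, and the 50 ms timeout is an arbitrary assumption. `call_blocking` shows back pressure inside the system boundary (the caller's thread waits for a slot), while `call_or_shed` shows the edge variant that gives up after a timeout and reports a 503.

```python
import threading

MAX_CONCURRENT_CALLS = 100  # the limit used in the example in the text


class BoundedCaller:
    """Caps concurrent calls to a slower downstream (hypothetical helper)."""

    def __init__(self, limit=MAX_CONCURRENT_CALLS):
        self._slots = threading.BoundedSemaphore(limit)

    def call_blocking(self, fn, *args):
        # Back pressure: when every slot is taken, the caller's thread
        # waits here until another call finishes and frees a slot.
        with self._slots:
            return fn(*args)

    def call_or_shed(self, fn, *args, timeout=0.05):
        # Edge variant: wait briefly for a slot, then refuse the request
        # (load shedding) instead of blocking indefinitely.
        if not self._slots.acquire(timeout=timeout):
            return 503, None
        try:
            return 200, fn(*args)
        finally:
            self._slots.release()


def demo():
    """Fill a 2-slot caller, shed one request, then succeed after draining."""
    caller = BoundedCaller(limit=2)
    release = threading.Event()       # lets the fake engine finish
    started = threading.Semaphore(0)  # counts calls that hold a slot

    def slow_engine(x):
        started.release()
        release.wait()
        return x * 2

    results = []

    def worker(i):
        results.append(caller.call_blocking(slow_engine, i))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    started.acquire()
    started.acquire()  # both slots are now held by the workers

    shed_status, _ = caller.call_or_shed(slow_engine, 99)  # no slot free

    release.set()  # the engine catches up and the slots drain
    for t in threads:
        t.join()
    ok_status, value = caller.call_or_shed(slow_engine, 21)
    return shed_status, ok_status, value, sorted(results)


if __name__ == "__main__":
    print(demo())
```

In `demo`, the request that arrives while both slots are held is shed, and the same kind of request succeeds once the workers finish. A real edge service would likely also record a metric or raise an alert on each shed request, in line with the monitoring advice above.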