To find a component's binding constraint, first rule out the resources that operate well within their limits at peak. On a Flickr web server, memory, disk I/O, and network were all comfortably below their limits during peak, leaving **CPU** as the critical resource. Ruling out the healthy resources is itself a useful result: it shows which measurements cannot be the capacity limit.
Then find a stable relationship between the binding resource and the work done, so one predicts the other. At ~50.7% CPU (45.20 user + 5.50 system) the server ran 46 busy Apache processes, which is about **1.1** percentage points of CPU per busy process (50.7 / 46 ≈ 1.10). That ratio held steady, and the relationship stayed linear between roughly 40% and 90% CPU. The linearity let Flickr trust CPU as the single defining metric and set the ceiling at **85% CPU**, which leaves headroom for spikes while still using servers efficiently.
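The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch, not something from the book: the variable names are illustrative, and the final step assumes the ratio stays constant all the way up to the 85% ceiling, which the linear 40% to 90% range makes plausible.

```python
# Figures quoted above: CPU split and busy Apache process count at one sample.
cpu_user = 45.20
cpu_system = 5.50
busy_apache = 46

cpu_total = cpu_user + cpu_system          # total % CPU at the sample
cpu_per_process = cpu_total / busy_apache  # % CPU per busy Apache process

# Assumption: the ratio holds up to the ceiling, so the ceiling can be
# restated as a count of busy Apache processes (rounded down).
CPU_CEILING = 85.0
max_busy_processes = int(CPU_CEILING // cpu_per_process)

print(f"CPU per busy process: {cpu_per_process:.2f}")
print(f"Busy processes at {CPU_CEILING:.0f}% CPU: {max_busy_processes}")
```

Under that assumption, the 85% ceiling corresponds to roughly 77 busy Apache processes per server.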
The move generalizes: find the one resource that saturates first, confirm it tracks the workload, then set the red line below the danger zone.
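One way to sketch the general procedure is shown below. Everything here is an assumption made for illustration: the function names, the peak readings, and the sample points are invented, not Flickr's measurements. The least-squares fit and its R² value stand in for "confirm it tracks the workload"; the source does not say how the linearity was verified.

```python
def binding_resource(peak_utilization):
    """Return the resource running closest to its limit at peak.

    peak_utilization maps resource name -> fraction of its limit (0.0 to 1.0).
    """
    return max(peak_utilization, key=peak_utilization.get)


def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus R^2 as a linearity check."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def workload_at_red_line(slope, intercept, red_line):
    """Workload at which the fitted line reaches the red-line utilization."""
    return (red_line - intercept) / slope


# Hypothetical peak readings, as fractions of each resource's limit.
peak = {"cpu": 0.88, "memory": 0.45, "disk_io": 0.20, "network": 0.30}

# Hypothetical samples: busy worker processes and the % CPU observed with them.
workers = [40, 50, 60, 70, 80]
cpu_pct = [44.0, 55.0, 66.0, 77.0, 88.0]

resource = binding_resource(peak)
slope, intercept, r_squared = linear_fit(workers, cpu_pct)
limit = workload_at_red_line(slope, intercept, red_line=85.0)

print(resource, round(slope, 2), round(r_squared, 3), round(limit, 1))
```

An R² close to 1 over the observed range suggests the resource is a usable proxy for workload. A low value means the red line cannot safely be translated into a workload figure, and another metric or a narrower range should be examined.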
---
*Source: [[The Art of Capacity Planning]] (John Allspaw, O'Reilly 2008) — Ch 3 — Measurement: Units of Capacity*