This is the resilience thesis that drives the entire book: despite your best-laid plans, **bad things will still happen**.
Preventing the bad events you can foresee is good. But it is "downright fatal" to assume you have predicted and eliminated *all* possible bad events. The real world — crazy users, global traffic, hostile actors from countries you've never heard of — goes well beyond anything you could hope to test for.
So the design goal shifts from prediction to **recovery**: prevent the failures you can, and make sure the system *as a whole* can recover from whatever unanticipated, severe trauma befalls it. This reframing — resilience over prediction, survivability over completeness — is what separates systems designed for production from systems designed to pass QA. The rest of the book operationalizes it, beginning with stability.
---
*Source: [[Release It Second Edition]] (Michael T. Nygard, Pragmatic Bookshelf 2018) — Ch 1 — Living in Production*