## Core Insight
Blameless postmortems focus on systemic causes, not individual blame. The fundamental principle: you can't fix people, but you can fix systems and processes to better support people making the right choices. If a culture of finger-pointing prevails, people won't surface issues for fear of punishment — leading to greater organizational risk.
## What a Postmortem Contains
A postmortem is a written record of an incident: its description and impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions to prevent recurrence.
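A minimal sketch of that record as a data structure, assuming a team wants to check drafts for empty sections before review. The class name `PostmortemRecord`, its field names, and the `missing_sections` helper are hypothetical and only mirror the five elements listed above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PostmortemRecord:
    """Hypothetical container mirroring the five parts of a postmortem."""

    incident_description: str = ""
    impact: str = ""
    mitigation_actions: list[str] = field(default_factory=list)
    root_causes: list[str] = field(default_factory=list)
    follow_up_actions: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        # A section counts as missing when it is an empty string or empty list.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


draft = PostmortemRecord(
    incident_description="Checkout API returned errors",
    impact="Some users could not complete purchases",
)
print(draft.missing_sections())
```

A draft with only the description and impact filled in reports the three remaining sections as missing, which gives reviewers a quick completeness check.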
## When to Write One
Define triggers before incidents occur:
- User-visible downtime or degradation beyond a predefined threshold
- Data loss of any kind
- On-call engineer intervention (release rollback, traffic rerouting)
- Resolution time above a predefined threshold
- Monitoring failure (manual discovery)
- Any stakeholder request
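The trigger list above can be sketched as a simple check that runs when an incident is closed. This is an illustration under stated assumptions, not a prescribed implementation: the `Incident` fields, the `postmortem_triggers` function, and both threshold values are hypothetical, and each team would set its own thresholds in advance.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical thresholds; agree on real values before incidents occur.
MAX_USER_VISIBLE_DOWNTIME = timedelta(minutes=5)
MAX_RESOLUTION_TIME = timedelta(hours=1)


@dataclass
class Incident:
    """Hypothetical summary of an incident, filled in at close-out."""

    user_visible_downtime: timedelta = timedelta(0)
    data_lost: bool = False
    on_call_intervened: bool = False
    resolution_time: timedelta = timedelta(0)
    detected_by_monitoring: bool = True
    stakeholder_requested: bool = False


def postmortem_triggers(incident: Incident) -> list[str]:
    """Return the reasons a postmortem is required (empty list if none)."""
    checks = [
        (incident.user_visible_downtime > MAX_USER_VISIBLE_DOWNTIME,
         "user-visible downtime or degradation beyond threshold"),
        (incident.data_lost, "data loss of any kind"),
        (incident.on_call_intervened, "on-call intervention"),
        (incident.resolution_time > MAX_RESOLUTION_TIME,
         "resolution time above threshold"),
        (not incident.detected_by_monitoring,
         "monitoring failure (manual discovery)"),
        (incident.stakeholder_requested, "stakeholder request"),
    ]
    return [reason for fired, reason in checks if fired]


incident = Incident(data_lost=True, detected_by_monitoring=False)
print(postmortem_triggers(incident))
```

Returning the list of reasons rather than a single boolean keeps the decision transparent: the postmortem can state which predefined trigger fired.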
## The Blameless Principle
- Assumes everyone had good intentions and did their best with available information
- Originated in healthcare and avionics, where mistakes can be fatal
- Shifts the question from "who did this wrong?" to "why did this person have incomplete or incorrect information?"
- Writing a postmortem is not punishment — it's a learning opportunity
## Cultivating the Culture
- **Postmortem of the month** newsletter
- **Reading clubs** — review old postmortems with open dialogue
- **Wheel of Misfortune** — role-play reenactments of past incidents
- **Visible rewards** — peer bonuses, public recognition from leadership
- **No postmortem left unreviewed** — regular review sessions to close out discussions
- Survey teams on effectiveness; iterate on the process itself
## Review Criteria
- Incident data collected for posterity?
- Impact assessments complete?
- Root cause sufficiently deep?
- Action plan appropriate, with resulting fixes at the right priority?
- Outcome shared with relevant stakeholders?
## Source
- [[Site Reliability Engineering - Chapter 15 - Postmortem Culture|SRE Ch 15: Postmortem Culture]] by John Lunney and Sue Lueder
## Related Concepts
- [[Incident Command System for SRE]]
- [[Proactive Failure Testing Culture]]
- [[Hypothetico-Deductive Troubleshooting Method]]
- [[Learning from Spectacular Failures - Kpaxs 20250404]] — Aviation's public failures forced rigorous learning