When people configure a "daily cron job," they commonly choose midnight. In a distributed system shared by thousands of teams, this creates massive simultaneous load spikes: 30+ MapReduce jobs spawning thousands of workers at the same instant.
## Google's Fix: Hash-Based Time Distribution
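Google extended the crontab format with a `?` wildcard, meaning "any value is acceptable; the system chooses." For each `?` field, the system hashes the job configuration over that field's range (e.g., 0-23 for hours), so launches are spread evenly across the range. Because the value comes from a hash of the configuration rather than a fresh random draw, a job with an unchanged configuration keeps the same slot from run to run. Users opt in by replacing specific times with `?`.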
Despite this, load remains spiky because some jobs have hard temporal dependencies (e.g., must run after a daily data export completes at a specific time).
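
The hash-based choice described above can be sketched roughly as follows. This is a minimal illustration, not Google's actual implementation: the function names `resolve_field` and `resolve_schedule`, the use of SHA-256, and the per-field salt are all assumptions made for the example.

```python
import hashlib


def resolve_field(field: str, job_config: str, lo: int, hi: int, salt: str) -> int:
    """Resolve one crontab field to a concrete value in [lo, hi].

    An explicit value is returned unchanged. A '?' is replaced by a value
    derived from a hash of the job configuration, so the same job always
    lands on the same slot while different jobs spread across the range.
    (Hypothetical helper; the hash function and salt are assumptions.)
    """
    if field != "?":
        return int(field)
    # Salt with the field name so minute and hour are chosen independently.
    digest = hashlib.sha256(f"{salt}:{job_config}".encode("utf-8")).digest()
    return lo + int.from_bytes(digest[:8], "big") % (hi - lo + 1)


def resolve_schedule(minute: str, hour: str, job_config: str) -> tuple[int, int]:
    """Return (minute, hour) with any '?' fields filled in by hashing."""
    return (
        resolve_field(minute, job_config, 0, 59, "minute"),
        resolve_field(hour, job_config, 0, 23, "hour"),
    )


if __name__ == "__main__":
    # "0 0 * * *" pins every job to midnight; "? ? * * *" lets the system choose.
    for name in ("team-a/daily-report", "team-b/log-rollup", "team-c/index-build"):
        minute, hour = resolve_schedule("?", "?", name)
        print(f"{name}: runs daily at {hour:02d}:{minute:02d}")
```

A job with a hard dependency keeps its explicit time (for instance `resolve_schedule("30", "2", ...)` stays at 02:30), which is why the approach smooths load without eliminating every spike.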
## The Deeper Pattern
Synchronized behavior in distributed systems creates correlated load spikes. The same problem appears in:
- **Cache stampede**: Many clients discover cache expiry simultaneously and all hit the backend
- **Fleet restarts**: Rolling restart with insufficient jitter causes periodic capacity dips
- **Retry storms**: Exponential backoff without jitter causes synchronized retry waves
- **Auto-scaling**: All instances scaling simultaneously based on the same metric threshold
The universal fix: add randomization (jitter) to break synchronization. Google's `?` wildcard is jitter applied to scheduling, with a hash of the job configuration standing in for the random draw.
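
As a rough sketch of what jitter looks like for two of the cases listed above (retry storms and cache stampedes), consider the helpers below. The function names, default values, and the uniform distribution are assumptions chosen for illustration; the retry variant is the one often called "full jitter."

```python
import random


def backoff_with_full_jitter(attempt: int, base: float = 0.1,
                             cap: float = 30.0, rng=random) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Plain exponential backoff would return min(cap, base * 2**attempt),
    so clients that failed together retry together. Drawing uniformly
    from [0, that ceiling] spreads the retries out instead.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def jittered_ttl(ttl: float, spread: float = 0.1, rng=random) -> float:
    """Cache TTL perturbed by up to +/- `spread` (a fraction of `ttl`).

    Entries written at the same moment then expire at slightly different
    times, so clients do not all miss the cache at once.
    """
    return ttl * (1.0 + rng.uniform(-spread, spread))


if __name__ == "__main__":
    for attempt in range(5):
        print(f"retry {attempt}: sleep {backoff_with_full_jitter(attempt):.3f}s")
    print(f"cache entry TTL: {jittered_ttl(300.0):.1f}s")
```

The trade-off is the same in both helpers: each individual client gives up a little predictability so that the aggregate load on the backend stays smooth.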