A StackSet operation that touches 1000 stack instances across 250 accounts and 4 regions can take hours and create cascading failure modes. CloudFormation gives you four levers to control the rollout: **maximum concurrent accounts**, **failure tolerance**, **region concurrency**, and **concurrency mode**. Two further features decide whether a target is touched at all and what values it receives: **target account gates** and **parameter overrides**.
## The four core levers
### 1. Maximum concurrent accounts
How many target accounts to operate on at once, **per region**. Number or percentage.
```
Maximum concurrent accounts = 50%, target accounts = 10
→ Up to 5 accounts deploy in parallel per region
```
For percentages, CFN **rounds down** if the result is not a whole number: `25% × 10 = 2.5 → 2`. The exception is a result that would round down to zero, where CFN uses 1 instead.
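The rounding rule can be sketched in a few lines. This is a minimal illustration, not CloudFormation's own code: the function name is hypothetical, and the floor of 1 reflects the API reference's note that a percentage which would round down to zero is treated as 1.

```python
def max_concurrent_from_percentage(percentage: int, target_accounts: int) -> int:
    """Convert a MaxConcurrentPercentage-style setting into a whole account count.

    Hypothetical helper that mirrors the documented rounding rule.
    """
    if not 1 <= percentage <= 100:
        raise ValueError("percentage must be between 1 and 100")
    # Integer division rounds down without floating-point error.
    rounded_down = (percentage * target_accounts) // 100
    # Assumption: a result of zero is raised to 1 so the operation can progress.
    return max(1, rounded_down)


print(max_concurrent_from_percentage(25, 10))
print(max_concurrent_from_percentage(50, 10))
```

With 10 target accounts, 25% yields 2 and 50% yields 5, matching the examples above.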
### 2. Failure tolerance
Maximum allowed failures **per region** before the operation aborts. Number or percentage.
```
Failure tolerance = 20%, 10 target accounts in 3 regions
→ Up to 2 failures per region tolerated
→ 3rd failure in any region → operation stops in that region
→ Later regions are not attempted (Sequential mode)
```
Failure tolerance interacts with Maximum concurrent accounts depending on Concurrency Mode (below).
### 3. Region concurrency
How regions are processed:
- **Sequential** (default) — one region at a time, in the order you specify
- **Parallel** — all regions simultaneously
Sequential limits blast radius — a bad deploy fails in region 1 and stops before reaching region 2. Parallel finishes faster but risks identical failures across all regions before you can cancel.
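The blast-radius argument for Sequential can be made concrete with a toy model. The sketch below assumes a simplified view in which each region ends with a known failure count; the function and its inputs are hypothetical and only illustrate the "stop before the next region" behavior.

```python
def regions_attempted(region_order, failures_by_region, failure_tolerance):
    """Return the regions a sequential operation reaches, in order.

    Simplified model: a region whose failure count exceeds the tolerance
    stops the operation, so regions after it are never attempted.
    """
    attempted = []
    for region in region_order:
        attempted.append(region)
        if failures_by_region.get(region, 0) > failure_tolerance:
            break  # tolerance exceeded here; later regions are skipped
    return attempted


order = ["us-east-1", "us-west-2", "eu-west-1"]
# A bad template fails 3 accounts in the first region with a tolerance of 2.
print(regions_attempted(order, {"us-east-1": 3}, failure_tolerance=2))
```

In this model the bad deploy never leaves `us-east-1`. Under Parallel, all three regions would already be in flight.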
### 4. Concurrency mode
How the actual concurrency level evolves as failures accumulate:
**Strict Failure Tolerance** (default):
- Initial concurrency = `min(MaxConcurrent, FailureTolerance + 1)`
- Each failure **reduces** active concurrency
- Operation stops when failures = `FailureTolerance + 1`
- Slower but safer — failures naturally throttle the rollout
**Soft Failure Tolerance**:
- Concurrency stays at `MaxConcurrent` regardless of failures
- Decoupled from failure tolerance
- Faster but failures can pile up — by the time you hit `FailureTolerance + 1` failures, the in-flight operations may push the actual count higher
- Useful when: you want maximum throughput AND failures are likely benign (existing-resource collisions, expected permission gaps)
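The four levers map onto a single operation-preferences structure in the API. The sketch below builds that structure as a plain dict. The key names and enum values follow the `StackSetOperationPreferences` API type as I understand it; the helper function itself and its argument names are hypothetical.

```python
def operation_preferences(region_order, max_concurrent_pct, failure_tolerance_pct,
                          parallel_regions=False, soft=False):
    """Build an OperationPreferences dict from the four rollout levers."""
    prefs = {
        "RegionConcurrencyType": "PARALLEL" if parallel_regions else "SEQUENTIAL",
        "MaxConcurrentPercentage": max_concurrent_pct,
        "FailureTolerancePercentage": failure_tolerance_pct,
        "ConcurrencyMode": (
            "SOFT_FAILURE_TOLERANCE" if soft else "STRICT_FAILURE_TOLERANCE"
        ),
    }
    if not parallel_regions:
        # Region order only matters when regions are processed one at a time.
        prefs["RegionOrder"] = list(region_order)
    return prefs


prefs = operation_preferences(["us-east-1", "eu-west-1"], 50, 20)
print(prefs)
```

A dict like this would typically be passed as the `OperationPreferences` argument of a boto3 call such as `create_stack_instances` or `update_stack_set`; that call is left out here so the block runs without AWS credentials.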
## A worked example — Strict mode at scale
Deploying 1000 stack instances in a single region. `FailureTolerance = 100`, `MaxConcurrent = 250`.
- Initial actual concurrency = 101 (min of 250 and 100+1)
- After 50 failures → actual concurrency drops to ~51
- At 101 failures → operation stops
- Final state: at most 101 failed stack instances; how many succeeded before the stop depends on the failure rate
Soft mode, same scenario:
- Actual concurrency = 250 throughout
- Stops when failures > 100, but in-flight ops continue
- Final state: between 101 and 350 failed stack instances (the 101st failure plus up to 249 in-flight operations that may also fail); the success count again depends on the failure rate
Soft mode is faster because it starts at roughly 2.5x the concurrency (250 vs 101) and never throttles down, at the cost of more failed stack instances.
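The arithmetic behind both modes can be checked with a small model. This is a sketch under stated assumptions, not CloudFormation's scheduler: it assumes Strict mode shrinks concurrency by exactly one slot per failure, and that in Soft mode every in-flight operation at stop time could still fail. Both function names are hypothetical.

```python
def strict_concurrency(max_concurrent, failure_tolerance, failures_so_far):
    """Actual concurrency in Strict mode after a given number of failures."""
    # The remaining failure budget caps how many operations may run at once.
    remaining_budget = failure_tolerance + 1 - failures_so_far
    return max(0, min(max_concurrent, remaining_budget))


def soft_failure_bounds(max_concurrent, failure_tolerance):
    """(minimum, worst-case) failed count when a Soft mode operation stops."""
    stop_at = failure_tolerance + 1
    # The other in-flight operations keep running and may all fail.
    worst_case = stop_at + (max_concurrent - 1)
    return stop_at, worst_case


for failures in (0, 50, 101):
    print(failures, strict_concurrency(250, 100, failures))
print(soft_failure_bounds(250, 100))
```

Under these assumptions Strict mode runs at 101, then 51, then 0 concurrent operations, while Soft mode can finish with anywhere from 101 to 350 failures.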
## Parameter overrides — the per-instance variation lever
A StackSet uses one template with one set of parameter values by default. **Parameter overrides** let you set different parameter values per stack instance (per account+region):
```bash
aws cloudformation update-stack-instances \
--stack-set-name my-baseline \
--accounts 111111111111 \
--regions us-east-1 \
--parameter-overrides ParameterKey=Subnets,ParameterValue=subnet-1baa3351
```
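Overrides are the **only** way to give individual stack instances different parameter values. (A template can still branch on pseudo parameters such as `AWS::AccountId` and `AWS::Region` without them.) Use cases: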
- VPC IDs that differ per account
- Region-specific AMI IDs
- Account-specific cost-center tags
- Per-environment instance sizes (when one StackSet covers prod+nonprod)
Without overrides, every stack instance gets the same parameter values from the StackSet definition.
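When many targets need their own values, the per-target calls can be generated from a lookup table. The sketch below only builds the request arguments. The keys match the boto3 `update_stack_instances` parameters as I understand them, while the helper name and the shape of the input mapping are assumptions made for illustration.

```python
def override_requests(stack_set_name, overrides_by_target):
    """Build one update_stack_instances request per (account, region) target.

    overrides_by_target maps (account_id, region) -> {parameter_key: value}.
    """
    requests = []
    for (account_id, region), params in sorted(overrides_by_target.items()):
        requests.append({
            "StackSetName": stack_set_name,
            "Accounts": [account_id],
            "Regions": [region],
            "ParameterOverrides": [
                {"ParameterKey": key, "ParameterValue": value}
                for key, value in sorted(params.items())
            ],
        })
    return requests


targets = {("111111111111", "us-east-1"): {"Subnets": "subnet-1baa3351"}}
for request in override_requests("my-baseline", targets):
    print(request)
```

Each dict could then be passed as keyword arguments to `update_stack_instances` on a boto3 CloudFormation client. Every such call starts its own StackSet operation, so a large table means many operations.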
## Target account gates — the pre-deployment veto
Account gates are **Lambda functions in target accounts** that CFN invokes before a stack operation. The function returns `SUCCEEDED` (proceed) or `FAILED` (skip this account, count toward failure tolerance).
Strict requirements:
| Requirement | Detail |
|-------------|--------|
| Function name | Must be `AWSCloudFormationStackSetAccountGate` (literal — not configurable) |
| Location | In the target account, in the region being deployed to |
| Permissions | `AWSCloudFormationStackSetExecutionRole` must have `lambda:InvokeFunction` |
| Behavior on missing | If no function with that exact name exists, CFN skips the gate and proceeds |
Use cases:
- Block deployment when active CloudWatch alarms exist (don't deploy during incidents)
- Verify maintenance window
- Check account-specific feature flags
- Enforce compliance preconditions (correct tags exist, certain resources are present)
A failed gate counts toward failure tolerance — so account gates are strict-mode-friendly (they slow rollout proportionally to gate failures).
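A gate for the maintenance-window use case might look like the sketch below. The window hours are hypothetical, and the `{"Status": ...}` return shape is an assumption based on AWS's sample gate functions, so verify it against the current documentation before relying on it. The optional `now` argument exists only to make the handler testable; Lambda itself passes just `event` and `context`.

```python
from datetime import datetime, timezone

# Hypothetical maintenance window: 02:00 up to (not including) 05:00 UTC.
WINDOW_START_HOUR = 2
WINDOW_END_HOUR = 5


def in_maintenance_window(now):
    """True when the given UTC time falls inside the window."""
    return WINDOW_START_HOUR <= now.hour < WINDOW_END_HOUR


def lambda_handler(event, context, now=None):
    """Account gate: allow the stack operation only during the window."""
    now = now or datetime.now(timezone.utc)
    status = "SUCCEEDED" if in_maintenance_window(now) else "FAILED"
    # Assumed return shape for an account gate function.
    return {"Status": status}


print(lambda_handler({}, None, now=datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)))
```

Deployed under the name `AWSCloudFormationStackSetAccountGate` in each target account and region, a function like this would make out-of-window accounts count as failures rather than silently deploying to them.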
This feature is **StackSets-only** — not available for normal stack operations.
## What you cannot control
- **Cross-region rollback as a unit** — a failure in region 3 doesn't roll back regions 1 and 2
- **Cross-account rollback as a unit** — same; each stack instance is independent at the rollback level
- **Operation pause** — you can stop, but can't pause-and-resume from a specific point
- **Target ordering within a region** — you specify region order, but accounts within a region are arbitrary
- **Per-account permission scoping** — execution role permissions are uniform across all targets
## Decision heuristics
| If you... | Choose |
|-----------|--------|
| Are deploying to production | Sequential regions + Strict mode + low failure tolerance (1-5%) |
| Are pushing a security patch and need it everywhere fast | Parallel regions + Soft mode + moderate failure tolerance |
| Have stable templates with known-benign edge cases (existing resources) | Soft mode |
| Are testing a new template | Sequential + Strict + low concurrency (5-10) so you can cancel quickly |
| Have any meaningful chance of "this will break X% of accounts" | Strict mode (auto-throttle) |
## Related
- [[CFN StackSets Cross-Account Cross-Region]]
- [[CFN StackSet Permission Models Self-Managed vs Service-Managed]]
- [[CFN Failure Rollback Behavior]]
- [[CFN Drift Detection Mechanics and Limits]]