Every monitor has an incident policy that controls when incidents open, how they’re confirmed, and when they auto-resolve. A policy has three components: trigger rules, a confirmation policy, and a recovery policy.
Trigger rules
Trigger rules define the conditions that open an incident from check results. Each monitor can have multiple rules at different severities.
Rule types

| Type | Behavior | Required fields |
|---|---|---|
| `consecutive_failures` | Opens an incident after N consecutive failed checks | `count` |
| `failures_in_window` | Opens an incident after N failures within a time window | `count`, `windowMinutes` |
| `response_time` | Opens an incident when response time exceeds a threshold | `thresholdMs`, `aggregationType` |
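
To make the two count-based rule types concrete, here is a minimal Python sketch of how they could be evaluated against a monitor's recent check results. The `Check` record and both helper functions are hypothetical names invented for illustration, not DevHelm's implementation, and the sketch assumes the history is ordered oldest first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Check:
    """Hypothetical record of one check result (not a DevHelm type)."""
    at: datetime
    passed: bool


def consecutive_failures(checks, count):
    """True if the most recent `count` checks all failed (count >= 1)."""
    recent = checks[-count:]
    return len(recent) == count and all(not c.passed for c in recent)


def failures_in_window(checks, count, window_minutes, now):
    """True if at least `count` checks failed in the last `window_minutes`."""
    cutoff = now - timedelta(minutes=window_minutes)
    failed = [c for c in checks if not c.passed and c.at >= cutoff]
    return len(failed) >= count


now = datetime(2024, 1, 1, 12, 0)
history = [
    Check(now - timedelta(minutes=4), passed=False),
    Check(now - timedelta(minutes=3), passed=True),
    Check(now - timedelta(minutes=2), passed=False),
    Check(now - timedelta(minutes=1), passed=False),
]

print(consecutive_failures(history, 2))        # True: the last two checks failed
print(failures_in_window(history, 3, 5, now))  # True: three failures in 5 minutes
```

The single passing check resets the consecutive streak but not the windowed count. That difference is why a `failures_in_window` rule can catch intermittent failures that a `consecutive_failures` rule would miss.
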
Scope
Each rule has a scope that determines how regions are evaluated:

| Scope | Behavior |
|---|---|
| `per_region` | Each region is evaluated independently; the rule must be satisfied in a single region |
| `any_region` | Failures are aggregated across all regions |
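
The difference between the two scopes can be sketched in a few lines of Python. This is an illustration under assumptions: the `rule_fires` helper, the region names, and the choice to model "aggregated" as summing per-region failure counts are all hypothetical, not taken from DevHelm.

```python
def rule_fires(failures_by_region, count, scope):
    """Decide whether a count-based rule fires under the given scope.

    failures_by_region maps a region name to its failure count
    (an input shape chosen for this sketch).
    """
    if scope == "per_region":
        # One region on its own must reach the threshold.
        return any(n >= count for n in failures_by_region.values())
    if scope == "any_region":
        # Failures are pooled across regions before comparing (assumed: summed).
        return sum(failures_by_region.values()) >= count
    raise ValueError(f"unknown scope: {scope}")


failures = {"us-east": 1, "eu-west": 1, "ap-south": 0}
print(rule_fires(failures, 2, "per_region"))  # False
print(rule_fires(failures, 2, "any_region"))  # True
```

With one failure in each of two regions, a threshold of 2 is met only when failures are pooled, so `any_region` is the more sensitive of the two scopes in this model.
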
Severity
Each rule targets a severity level. When multiple rules fire, the highest severity wins:

| Severity | Priority |
|---|---|
| `down` | Highest: complete failure |
| `degraded` | Lower: partial failure or performance issue |
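
As a small illustration of "highest severity wins", the sketch below picks the winning severity when several rules fire at once. The `winning_severity` helper, the numeric priorities, and the `severity` key name are assumptions made for this example.

```python
# Higher number means higher priority, matching the table above.
SEVERITY_PRIORITY = {"degraded": 1, "down": 2}


def winning_severity(fired_rules):
    """Return the highest severity among fired rules, or None if none fired."""
    if not fired_rules:
        return None
    severities = [rule["severity"] for rule in fired_rules]
    return max(severities, key=lambda s: SEVERITY_PRIORITY[s])


fired = [
    {"type": "response_time", "severity": "degraded"},
    {"type": "consecutive_failures", "severity": "down"},
]
print(winning_severity(fired))  # down
```
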
Response time aggregation
For `response_time` rules, the `aggregationType` field controls how latency is evaluated across checks:

| Aggregation | Behavior |
|---|---|
| `all_exceed` | Every check in the evaluation window must exceed the threshold |
| `average` | The average response time exceeds the threshold |
| `p95` | The 95th percentile exceeds the threshold |
| `max` | The maximum response time exceeds the threshold |
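
The four aggregations can give different answers for the same samples, as the following Python sketch shows. The `exceeds_threshold` helper is hypothetical, and the nearest-rank percentile used for `p95` is an assumption; this page does not say which percentile definition DevHelm uses.

```python
import math


def exceeds_threshold(samples_ms, threshold_ms, aggregation_type):
    """Apply one aggregation to response times (ms) from the evaluation window."""
    if not samples_ms:
        return False
    if aggregation_type == "all_exceed":
        return all(s > threshold_ms for s in samples_ms)
    if aggregation_type == "average":
        return sum(samples_ms) / len(samples_ms) > threshold_ms
    if aggregation_type == "p95":
        # Nearest-rank percentile: an assumption, other definitions exist.
        ordered = sorted(samples_ms)
        rank = math.ceil(0.95 * len(ordered))
        return ordered[rank - 1] > threshold_ms
    if aggregation_type == "max":
        return max(samples_ms) > threshold_ms
    raise ValueError(f"unknown aggregationType: {aggregation_type}")


samples = [120, 150, 180, 900]  # one slow outlier
for kind in ("all_exceed", "average", "p95", "max"):
    print(kind, exceeds_threshold(samples, 500, kind))
```

With a single slow outlier and a 500 ms threshold, `all_exceed` and `average` stay quiet while `p95` and `max` fire in this sketch. That makes `all_exceed` the most conservative option and `max` the most sensitive.
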
Default policy
When you create a monitor without specifying a policy, DevHelm applies a sensible default:
- Trigger: 2 consecutive failures per region → severity `down`
- Confirmation: Multi-region, 1 region failing, wait up to `max(60, frequency × 2)` seconds
- Recovery: 2 consecutive successes, 2 regions passing, 5-minute cooldown
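
The default described above can be written out as data. The Python sketch below does that and computes the confirmation wait from the check frequency. It assumes the frequency is expressed in seconds, and the grouping keys `triggerRules`, `confirmation`, and `recovery`, along with `scope` and `severity`, are guessed names used only for illustration.

```python
def default_max_wait_seconds(frequency_seconds):
    """Default confirmation wait: max(60, frequency x 2) seconds."""
    return max(60, frequency_seconds * 2)


def default_policy(frequency_seconds):
    """Build the default policy as a plain dict.

    Grouping keys and the scope/severity key names are guesses for this
    sketch; the other field names come from the tables on this page.
    """
    return {
        "triggerRules": [
            {"type": "consecutive_failures", "count": 2,
             "scope": "per_region", "severity": "down"},
        ],
        "confirmation": {
            "type": "multi_region",
            "minRegionsFailing": 1,
            "maxWaitSeconds": default_max_wait_seconds(frequency_seconds),
        },
        "recovery": {
            "consecutiveSuccesses": 2,
            "minRegionsPassing": 2,
            "cooldownMinutes": 5,
        },
    }


print(default_max_wait_seconds(20))   # 60: the floor applies
print(default_max_wait_seconds(300))  # 600: twice the frequency
```

The 60-second floor means fast monitors still get a full minute for other regions to report, while slower monitors wait long enough for two check cycles.
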
Example
A policy with two trigger rules, one for complete failures and one for performance degradation:
Confirmation
Confirmation prevents false positives by requiring failures from multiple probe regions before promoting an incident to `CONFIRMED` status.

| Field | Type | Description |
|---|---|---|
| `type` | string | Confirmation strategy; currently `multi_region` |
| `minRegionsFailing` | integer | Minimum regions that must be failing to confirm |
| `maxWaitSeconds` | integer | Maximum seconds to wait for enough regions to report failures |

After a trigger rule fires, DevHelm waits up to `maxWaitSeconds` for at least `minRegionsFailing` regions to also report failures. If enough regions confirm within the window, the incident moves to `CONFIRMED` and alerts fire. If the window expires without enough regions failing, the incident is discarded.
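
The confirm-or-discard decision can be sketched as a function evaluated at a single moment while an incident awaits confirmation. The `confirmation_outcome` helper is hypothetical, and the `"DISCARDED"` and `"PENDING"` strings are labels for this sketch only; this page documents `CONFIRMED` but does not name the other states.

```python
def confirmation_outcome(failing_regions, min_regions_failing,
                         elapsed_seconds, max_wait_seconds):
    """Return "CONFIRMED", "DISCARDED", or "PENDING" for an unconfirmed incident."""
    if elapsed_seconds > max_wait_seconds:
        return "DISCARDED"  # window expired without enough regions failing
    if len(set(failing_regions)) >= min_regions_failing:
        return "CONFIRMED"  # enough regions agree, so alerts can fire
    return "PENDING"        # keep waiting for more regions to report


print(confirmation_outcome({"us-east", "eu-west"}, 2, 30, 120))  # CONFIRMED
print(confirmation_outcome({"us-east"}, 2, 30, 120))             # PENDING
print(confirmation_outcome({"us-east"}, 2, 121, 120))            # DISCARDED
```
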
Set `minRegionsFailing` to 1 to confirm on the first region that reports a failure. This is useful for monitors running from a single region.
Recovery
Recovery controls when a confirmed incident auto-resolves.

| Field | Type | Description |
|---|---|---|
| `consecutiveSuccesses` | integer | Number of consecutive passing checks required before resolving |
| `minRegionsPassing` | integer | Minimum regions that must be healthy before recovery completes |
| `cooldownMinutes` | integer | Minutes after resolution before a new incident can open (0–60) |

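Once `consecutiveSuccesses` consecutive checks pass in at least `minRegionsPassing` regions, the incident moves to `RESOLVED`. The cooldown period then suppresses new incidents for the same monitor, preventing flapping.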
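
The recovery fields combine into two separate checks: whether an incident may resolve, and whether a new one may open afterwards. The Python sketch below shows one way to model them. Both helpers are hypothetical, and treating a region as "passing" once its own success streak reaches `consecutiveSuccesses` is an interpretation rather than documented behavior.

```python
from datetime import datetime, timedelta


def can_resolve(success_streak_by_region, consecutive_successes, min_regions_passing):
    """True when enough regions each have a long enough run of passing checks."""
    healthy = [region for region, streak in success_streak_by_region.items()
               if streak >= consecutive_successes]
    return len(healthy) >= min_regions_passing


def in_cooldown(resolved_at, now, cooldown_minutes):
    """True while new incidents for the monitor are still suppressed."""
    return now < resolved_at + timedelta(minutes=cooldown_minutes)


streaks = {"us-east": 2, "eu-west": 3, "ap-south": 0}
print(can_resolve(streaks, 2, 2))  # True: two regions have 2+ successes

resolved_at = datetime(2024, 1, 1, 12, 0)
print(in_cooldown(resolved_at, resolved_at + timedelta(minutes=4), 5))  # True
print(in_cooldown(resolved_at, resolved_at + timedelta(minutes=5), 5))  # False
```

A `cooldownMinutes` of 0 makes `in_cooldown` false immediately, so a new incident could open on the very next failing check.
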
Managing policies
View a monitor’s policy
Update a policy
Next steps
Incidents overview
Understand the full incident lifecycle and statuses.
Monitoring regions
Learn how multi-region checks interact with confirmation policies.
Alerting overview
Configure notifications for confirmed incidents.
Maintenance windows
Suppress incidents during planned downtime.