Your First Incident

By the end of this guide, you’ll understand what happens when a monitor fails, how to investigate an incident, and how to resolve it.

Prerequisites

At least one monitor running — see First HTTP monitor
Familiarity with the Dashboard or CLI

When an incident appears

An incident appears in your Dashboard when a monitor’s check failures match its incident policy. Here’s the typical flow:

Checks start failing

A monitor’s check returns an error or fails an assertion. DevHelm starts counting consecutive failures against the monitor’s trigger rule.

Trigger rule matched

After enough consecutive failures (default: 2), the trigger rule fires. The incident enters TRIGGERED status.

Multi-region confirmation

If the monitor runs from multiple regions, DevHelm waits until additional regions also report failures before confirming the incident. This reduces false positives caused by network issues in a single region.

Incident confirmed

The incident moves to CONFIRMED. Alerts fire through your notification policies.
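
The flow above can be sketched as a small state machine. This is an illustration only, not DevHelm's implementation: the class name, the region names in the usage note, and the number of regions needed for confirmation (`MIN_CONFIRMING_REGIONS`) are assumptions, since the guide only states the default of 2 consecutive failures.

```python
from dataclasses import dataclass, field

FAILURES_TO_TRIGGER = 2      # from the guide: "default: 2"
MIN_CONFIRMING_REGIONS = 2   # hypothetical; the guide does not state this value


@dataclass
class IncidentTracker:
    """Tracks per-region failure streaks for one monitor (illustrative only)."""
    streaks: dict = field(default_factory=dict)
    status: str = "OK"

    def record(self, region: str, passed: bool) -> str:
        # A passing check resets the region's streak; a failure extends it.
        self.streaks[region] = 0 if passed else self.streaks.get(region, 0) + 1
        failing = [r for r, n in self.streaks.items() if n >= FAILURES_TO_TRIGGER]
        # The trigger rule fires once any region reaches the failure threshold.
        if self.status == "OK" and failing:
            self.status = "TRIGGERED"
        # Confirmation requires enough regions to agree. Recovery is not modeled here.
        if self.status == "TRIGGERED" and len(failing) >= MIN_CONFIRMING_REGIONS:
            self.status = "CONFIRMED"
        return self.status
```

Feeding two failures from one hypothetical region (`"us-east"`) moves the tracker to TRIGGERED; two more from a second region (`"eu-west"`) move it to CONFIRMED.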

Investigate the incident

View incident details

devhelm incidents list
devhelm incidents get <incident-id>

The incident detail shows:

Severity — DOWN, DEGRADED, or MAINTENANCE
Affected regions — which probe regions are seeing failures
Timeline — every status change and update since detection
Trigger rule — which rule fired and why
Duration — how long the incident has been active

Check the timeline

The incident timeline shows exactly what happened and when:

devhelm incidents get <incident-id>

Look at the updates array — each entry records a status change, user update, or system event with a timestamp.
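
If you save the incident as JSON, a short script can print the updates oldest first. This is a minimal sketch: only the `updates` key comes from this guide, while the `timestamp`, `type`, and `body` field names and the sample data are assumptions, so adjust them to match the actual output.

```python
import json
from datetime import datetime

# Made-up sample. Only the "updates" key is taken from the guide; the fields
# "timestamp", "type", and "body" are hypothetical.
SAMPLE = """
{"updates": [
  {"timestamp": "2024-01-01T10:05:00+00:00", "type": "status_change", "body": "CONFIRMED"},
  {"timestamp": "2024-01-01T10:02:00+00:00", "type": "status_change", "body": "TRIGGERED"}
]}
"""


def format_timeline(incident_json: str) -> list[str]:
    """Return one line per update, sorted oldest first."""
    incident = json.loads(incident_json)
    updates = sorted(
        incident.get("updates", []),
        key=lambda u: datetime.fromisoformat(u["timestamp"]),
    )
    return [f'{u["timestamp"]}  {u["type"]}: {u["body"]}' for u in updates]


if __name__ == "__main__":
    for line in format_timeline(SAMPLE):
        print(line)
```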

View failing check results

Drill into the monitor’s recent check results to understand what’s failing:

devhelm monitors checks <monitor-id> --limit 10

Look for assertion failures, HTTP error codes, timeouts, or connection errors.
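
When there are many results, tallying them by failure type can show the dominant cause quickly. The sketch below assumes each check result is a dictionary; every key it reads (`passed`, `failed_assertions`, `error`, `status_code`) is a hypothetical field name, not a documented DevHelm schema.

```python
from collections import Counter


def classify(result: dict) -> str:
    """Bucket one check result. All keys read here are hypothetical."""
    if result.get("passed"):
        return "ok"
    if result.get("failed_assertions"):
        return "assertion_failure"
    error = (result.get("error") or "").lower()
    if "timeout" in error or "timed out" in error:
        return "timeout"
    if error:
        # Any other transport-level error is treated as a connection problem.
        return "connection_error"
    status = result.get("status_code")
    if isinstance(status, int) and status >= 400:
        return "http_error"
    return "unknown"


def summarize(results: list[dict]) -> Counter:
    """Count how many results fall into each bucket."""
    return Counter(classify(r) for r in results)
```

A summary dominated by `timeout` points toward latency or network problems, while `http_error` or `assertion_failure` suggests the endpoint is reachable but misbehaving.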

Resolve the incident

Automatic resolution

Most incidents resolve automatically. When the monitor starts passing again:

Consecutive passing checks accumulate until the required count is reached (default: 2)
Enough regions report healthy (default: 2)
The incident moves to RESOLVED
A cooldown period prevents immediate reopening (default: 5 minutes)
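
The recovery conditions above can be expressed as two small checks. This is a sketch under assumptions: the defaults come from this guide, but how DevHelm combines pass streaks with region health is not specified here, so the code assumes a region counts as healthy once its own streak of passing checks reaches the threshold. The function names are hypothetical.

```python
from datetime import datetime, timedelta

PASSES_TO_RECOVER = 2            # from the guide: "default: 2 required"
HEALTHY_REGIONS_NEEDED = 2       # from the guide: "default: 2"
COOLDOWN = timedelta(minutes=5)  # from the guide: "default: 5 minutes"


def should_resolve(pass_streaks: dict[str, int]) -> bool:
    """True when enough regions have enough consecutive passing checks.

    Assumption: a region is healthy once its own pass streak reaches
    PASSES_TO_RECOVER.
    """
    healthy = sum(1 for n in pass_streaks.values() if n >= PASSES_TO_RECOVER)
    return healthy >= HEALTHY_REGIONS_NEEDED


def in_cooldown(resolved_at: datetime, now: datetime) -> bool:
    """True while the quiet period after resolution is still running."""
    return now - resolved_at < COOLDOWN
```

With these defaults, an incident whose regions show streaks of 2 and 1 stays open, and it resolves once the second region records another passing check.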

Manual resolution

If you’ve fixed the issue and don’t want to wait for automatic recovery:

devhelm incidents resolve <incident-id> \
  --body "Deployed hotfix to restore API endpoint"

Add context

Post updates during investigation to keep your team informed:

devhelm incidents update <incident-id> \
  --body "Investigating — seeing 503s from upstream dependency" \
  --notify

Key concepts to know

Concept	What to know
Incident policy	Controls when incidents open (trigger rules) and close (recovery policy)
Confirmation	Multi-region validation that reduces false positives
Cooldown	Quiet period after resolution that prevents flapping
Reopening	If the monitor fails again after cooldown, the same incident reopens

Next steps

Incidents overview

Full lifecycle, statuses, severities, and sources.

Incident policies

Customize trigger rules and recovery behavior.

First alert

Get notified when incidents happen.

Incidents guide

Day-to-day incident management workflows.
