Monitors in YAML

The `monitors` section of `devhelm.yml` defines your monitoring checks. Each monitor specifies a type, configuration, assertions, and scheduling.

Monitor fields

Field	Type	Required	Description
`name`	string	Yes	Unique monitor name
`type`	string	Yes	`HTTP`, `DNS`, `TCP`, `ICMP`, `HEARTBEAT`, `MCP_SERVER`
`config`	object	Yes	Type-specific configuration (see below)
`frequency`	integer	—	Check frequency in seconds (30–86400)
`enabled`	boolean	—	Whether the monitor runs checks (default: `true`)
`regions`	string[]	—	Probe regions (e.g., `us-east`, `eu-west`)
`environment`	string	—	Environment slug for variable substitution
`tags`	string[]	—	Tag names to attach
`alertChannels`	string[]	—	Alert channel names to notify on incidents
`assertions`	object[]	—	Pass/fail criteria (see below)
`auth`	object	—	Authentication configuration (see below)
`incidentPolicy`	object	—	Trigger, confirmation, and recovery rules
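
As a rough illustration of how the optional top-level fields fit together, the sketch below sets most of them on a single monitor. The monitor name, URL, environment slug, tag names, and alert channel name are all placeholders invented for this example; only the field names and the frequency range come from the table above.

```yaml
# Hypothetical monitor exercising the optional top-level fields from the table.
monitors:
  - name: Checkout API                  # illustrative name; must be unique
    type: HTTP
    config:
      url: https://checkout.example.com/health   # placeholder URL
    frequency: 300                      # seconds; must fall within 30-86400
    enabled: true                       # the default, shown here for clarity
    regions: [us-east, eu-west]
    environment: production             # assumed environment slug
    tags: [checkout, critical]          # assumed tag names
    alertChannels: [On-Call Slack]      # assumed alert channel name
```

Setting `enabled: false` instead would keep the definition in the file while stopping its checks.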

Monitor types

HTTP

monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
      method: POST
      verifyTls: true
      customHeaders:
        Accept: application/json
      requestBody: '{"ping": true}'
      contentType: application/json
    frequency: 60
    regions: [us-east, eu-west]

Config field	Type	Required	Description
`url`	string	Yes	Target URL to check
`method`	string	—	HTTP method: `GET`, `POST`, `PUT`, `PATCH`, `DELETE`, `HEAD`
`verifyTls`	boolean	—	Validate TLS certificate (default: `true`)
`customHeaders`	map	—	Custom request headers
`requestBody`	string	—	Request body for POST/PUT/PATCH
`contentType`	string	—	Content-Type header value
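
The config fields above can also be combined for a lightweight reachability check. The following is a minimal sketch, assuming a staging host with a self-signed certificate; the monitor name, URL, and header value are hypothetical.

```yaml
# Sketch: HEAD request against a staging host, skipping TLS validation.
monitors:
  - name: Staging Site Reachability     # illustrative name
    type: HTTP
    config:
      url: https://staging.example.com/ # placeholder URL
      method: HEAD                      # no request body is needed for HEAD
      verifyTls: false                  # assumption: the host uses a self-signed certificate
      customHeaders:
        User-Agent: devhelm-probe       # hypothetical header value
```

Leaving `verifyTls` at its default of `true` is the safer choice for production endpoints.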

DNS

monitors:
  - name: DNS Resolution
    type: DNS
    config:
      hostname: example.com
      recordTypes: [A, AAAA]
      nameservers: [8.8.8.8]
      timeoutMs: 5000

Config field	Type	Required	Description
`hostname`	string	Yes	Domain name to resolve
`recordTypes`	string[]	—	DNS record types to query
`nameservers`	string[]	—	Custom nameservers
`timeoutMs`	integer	—	Query timeout in milliseconds
`totalTimeoutMs`	integer	—	Total timeout across retries
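
The example above does not use `totalTimeoutMs`, so here is a small sketch that sets both timeouts. The values are illustrative, and the comment about retry headroom is an assumption about how the two timeouts interact, not documented behavior.

```yaml
# Sketch: per-query and total timeouts on a DNS monitor.
monitors:
  - name: Apex Record Lookup            # illustrative name
    type: DNS
    config:
      hostname: example.com
      recordTypes: [A]
      nameservers: [8.8.8.8, 1.1.1.1]   # example public resolvers
      timeoutMs: 2000                   # limit for a single query
      totalTimeoutMs: 6000              # assumed to leave room for a few retries
```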

TCP

monitors:
  - name: Database Port
    type: TCP
    config:
      host: db.example.com
      port: 5432
      timeoutMs: 5000

Config field	Type	Required	Description
`host`	string	Yes	Target hostname or IP
`port`	integer	—	TCP port (1–65535)
`timeoutMs`	integer	—	Connection timeout in milliseconds

ICMP

monitors:
  - name: Server Ping
    type: ICMP
    config:
      host: 10.0.1.5
      packetCount: 3
      timeoutMs: 5000

Config field	Type	Required	Description
`host`	string	Yes	Target hostname or IP
`packetCount`	integer	—	Number of ICMP packets to send
`timeoutMs`	integer	—	Timeout in milliseconds

Heartbeat

monitors:
  - name: Nightly Backup
    type: HEARTBEAT
    config:
      expectedInterval: 86400
      gracePeriod: 3600

Config field	Type	Required	Description
`expectedInterval`	integer	—	Expected ping interval in seconds
`gracePeriod`	integer	—	Grace period in seconds before marking as missed

Heartbeat monitors generate a unique ping URL. Your service sends a request to this URL on each run. If no ping arrives within the expected interval plus grace period, an incident is created.
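
To make the deadline arithmetic concrete, the sketch below configures a monitor for an hourly job. It assumes `gracePeriod` uses the same unit (seconds) as `expectedInterval`; the monitor name is hypothetical.

```yaml
# Sketch: heartbeat for a job that should ping once per hour.
monitors:
  - name: Hourly Sync Job               # illustrative name
    type: HEARTBEAT
    config:
      expectedInterval: 3600            # one ping expected every hour
      gracePeriod: 300                  # assumed to be seconds, like expectedInterval
      # Deadline: 3600 + 300 = 3900 seconds (65 minutes) without a ping
```

Under that assumption, a ping that arrives 64 minutes after the previous one is still on time, while silence past 65 minutes creates an incident.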

MCP Server

monitors:
  - name: MCP Health
    type: MCP_SERVER
    config:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-everything"]
      env:
        API_KEY: ${MCP_API_KEY}

Config field	Type	Required	Description
`command`	string	Yes	Command to launch the MCP server
`args`	string[]	—	Command arguments
`env`	map	—	Environment variables

Assertions

Assertions define pass/fail criteria for each check:

monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
    assertions:
      - config:
          type: status_code
          expected: "200"
          operator: equals
        severity: fail
      - config:
          type: response_time
          thresholdMs: 2000
        severity: warn

Field	Type	Required	Description
`config.type`	string	Yes	Assertion type discriminator (snake_case)
`config`	object	Yes	Type-specific assertion fields (includes `type`)
`severity`	string	—	`fail` or `warn` (default: `fail`)

Assertion types are snake_case strings set in `config.type`. Common HTTP assertions include `status_code`, `response_time`, `header_present`, `body_contains`, and `json_path`.
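
One way to use the two severities is to tier a latency check. The sketch below relies only on the fields shown above; whether the same assertion type may appear twice on one monitor is an assumption, and the name, URL, and thresholds are illustrative.

```yaml
# Sketch: tiered latency assertions with warn and fail severities.
monitors:
  - name: Search API Latency            # illustrative name
    type: HTTP
    config:
      url: https://api.example.com/search?q=ping   # placeholder URL
    assertions:
      - config:
          type: status_code
          expected: "200"
          operator: equals
        # severity omitted, so it defaults to fail
      - config:
          type: response_time
          thresholdMs: 1000             # slower than 1 s: warning only
        severity: warn
      - config:
          type: response_time
          thresholdMs: 3000             # slower than 3 s: failing check
        severity: fail
```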

Authentication

For monitors that need to authenticate against protected endpoints:

monitors:
  - name: Protected API
    type: HTTP
    config:
      url: https://api.example.com/internal/health
    auth:
      type: bearer
      secret: API_TOKEN

Auth type	Fields	Description
`bearer`	`secret`	Bearer token from vault secret
`basic`	`secret`	Base64 `username:password` from vault secret
`api_key`	`headerName`, `secret`	Custom header with API key
`header`	`headerName`, `secret`	Custom header with arbitrary value

The `secret` field references a key from the secrets section or vault.
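
The other auth types follow the same shape. This sketch shows `api_key` and `basic` using the fields listed in the table; the header name, secret keys, monitor names, and URLs are hypothetical.

```yaml
# Sketch: api_key and basic auth on two monitors.
monitors:
  - name: Partner API                   # illustrative name
    type: HTTP
    config:
      url: https://partner.example.com/v1/status   # placeholder URL
    auth:
      type: api_key
      headerName: X-API-Key             # assumed header name
      secret: PARTNER_API_KEY           # hypothetical secret key
  - name: Admin Console                 # illustrative name
    type: HTTP
    config:
      url: https://admin.example.com/health        # placeholder URL
    auth:
      type: basic
      secret: ADMIN_BASIC_CREDENTIALS   # hypothetical secret holding username:password
```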

Incident policy

Control when incidents are created, confirmed, and resolved:

monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
    incidentPolicy:
      triggerRules:
        - type: consecutive_failures
          severity: down
          scope: per_region
          count: 3
        - type: response_time
          severity: degraded
          thresholdMs: 5000
          aggregationType: p95
      confirmation:
        type: multi_region
        minRegionsFailing: 2
        maxWaitSeconds: 120
      recovery:
        consecutiveSuccesses: 2
        minRegionsPassing: 2
        cooldownMinutes: 5

Trigger rules

Field	Type	Description
`type`	string	`consecutive_failures`, `failures_in_window`, or `response_time`
`severity`	string	`down` or `degraded`
`scope`	string	`per_region` or `any_region`
`count`	integer	Failure count threshold
`windowMinutes`	integer	Time window for `failures_in_window`
`thresholdMs`	integer	Response time threshold for `response_time`
`aggregationType`	string	`all_exceed`, `average`, `p95`, `max`
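
The `failures_in_window` type pairs `count` with `windowMinutes`. A minimal sketch follows, assuming that `confirmation` and `recovery` can be left out of `incidentPolicy`; the monitor name, URL, and thresholds are illustrative.

```yaml
# Sketch: degrade when 5 checks fail within 10 minutes in any region.
monitors:
  - name: Flaky Upstream                # illustrative name
    type: HTTP
    config:
      url: https://upstream.example.com/health     # placeholder URL
    incidentPolicy:
      triggerRules:
        - type: failures_in_window
          severity: degraded
          scope: any_region
          count: 5                      # failure count threshold
          windowMinutes: 10             # window in which the failures are counted
```

A windowed rule like this tolerates isolated failures that a low `consecutive_failures` count would flag.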

Confirmation

Field	Type	Description
`type`	string	`multi_region`
`minRegionsFailing`	integer	Minimum failing regions to confirm
`maxWaitSeconds`	integer	Maximum time to wait for confirmation

Recovery

Field	Type	Description
`consecutiveSuccesses`	integer	Consecutive passing checks to recover
`minRegionsPassing`	integer	Minimum passing regions to recover
`cooldownMinutes`	integer	Delay before auto-resolving

Resource groups

Group monitors into composite health views:

resourceGroups:
  - name: Payment Service
    description: All payment-related monitors
    alertPolicy: Critical Alerts
    defaultFrequency: 30
    defaultRegions: [us-east, eu-west]
    healthThresholdType: PERCENTAGE
    healthThresholdValue: 80
    suppressMemberAlerts: true
    monitors: [Payment API, Payment Webhook, Stripe Health]

Field	Type	Description
`name`	string	Resource group name
`description`	string	Human-readable description
`alertPolicy`	string	Notification policy name for group alerts
`defaultFrequency`	integer	Default check frequency for members
`defaultRegions`	string[]	Default probe regions for members
`defaultAlertChannels`	string[]	Default alert channel names
`defaultEnvironment`	string	Default environment slug
`healthThresholdType`	string	`COUNT` or `PERCENTAGE`
`healthThresholdValue`	number	Threshold value
`suppressMemberAlerts`	boolean	Suppress individual member alerts
`confirmationDelaySeconds`	integer	Delay before confirming group incident
`recoveryCooldownMinutes`	integer	Delay before auto-resolving
`monitors`	string[]	Monitor names in this group
`services`	string[]	Dependency service slugs in this group
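
The fields not used in the example above can be sketched as follows. Every name and value here is a placeholder, and the source does not state how `healthThresholdValue` is interpreted for `COUNT`, so treat the threshold as illustrative only.

```yaml
# Sketch: resource group using a COUNT threshold and group-level timing.
resourceGroups:
  - name: Search Platform               # illustrative name
    description: Search API and its upstream dependencies
    alertPolicy: Platform Alerts        # assumed notification policy name
    defaultAlertChannels: [On-Call Slack]          # assumed alert channel name
    defaultEnvironment: production      # assumed environment slug
    healthThresholdType: COUNT
    healthThresholdValue: 2             # exact semantics not stated in the source
    confirmationDelaySeconds: 60        # wait before confirming a group incident
    recoveryCooldownMinutes: 10         # wait before auto-resolving
    monitors: [Search API, Search Indexer]         # hypothetical monitor names
    services: [elasticsearch]           # hypothetical dependency service slug
```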

Next steps

Alert channels

Tags and secrets