The monitors section of devhelm.yml defines your monitoring checks. Each monitor specifies a type, configuration, assertions, and scheduling.

Monitor fields

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique monitor name |
| type | string | Yes | HTTP, DNS, TCP, ICMP, HEARTBEAT, MCP_SERVER |
| config | object | Yes | Type-specific configuration (see below) |
| frequency | integer | | Check frequency in seconds (30–86400) |
| enabled | boolean | | Whether the monitor runs checks (default: true) |
| regions | string[] | | Probe regions (e.g., us-east, eu-west) |
| environment | string | | Environment slug for variable substitution |
| tags | string[] | | Tag names to attach |
| alertChannels | string[] | | Alert channel names to notify on incidents |
| assertions | object[] | | Pass/fail criteria (see below) |
| auth | object | | Authentication configuration (see below) |
| incidentPolicy | object | | Trigger, confirmation, and recovery rules |
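
As a sketch of how the optional top-level fields combine, the monitor below sets several of them at once. The monitor name, endpoint, environment slug, tag names, and alert channel name are hypothetical placeholders, assumed to be defined elsewhere in your configuration.

```yaml
monitors:
  - name: Checkout API
    type: HTTP
    config:
      url: https://api.example.com/checkout/health   # hypothetical endpoint
    frequency: 300                  # seconds; must fall within 30–86400
    enabled: true                   # the default, shown for clarity
    regions: [us-east, eu-west]
    environment: production         # hypothetical environment slug
    tags: [checkout, critical]      # hypothetical tag names
    alertChannels: [ops-slack]      # hypothetical alert channel name
```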

Monitor types

HTTP

monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
      method: GET
      verifyTls: true
      customHeaders:
        Accept: application/json
      requestBody: '{"ping": true}'
      contentType: application/json
    frequency: 60
    regions: [us-east, eu-west]

| Config field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Target URL to check |
| method | string | | HTTP method: GET, POST, PUT, PATCH, DELETE, HEAD |
| verifyTls | boolean | | Validate TLS certificate (default: true) |
| customHeaders | map | | Custom request headers |
| requestBody | string | | Request body for POST/PUT/PATCH |
| contentType | string | | Content-Type header value |
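
Since requestBody applies to POST, PUT, and PATCH, a body-carrying check might look like the following sketch. The monitor name, endpoint, and payload are hypothetical; only the field names come from the table above.

```yaml
monitors:
  - name: Search API
    type: HTTP
    config:
      url: https://api.example.com/search   # hypothetical endpoint
      method: POST
      requestBody: '{"query": "ping"}'      # hypothetical JSON payload
      contentType: application/json         # matches the JSON payload
      customHeaders:
        Accept: application/json
    frequency: 120
```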

DNS

monitors:
  - name: DNS Resolution
    type: DNS
    config:
      hostname: example.com
      recordTypes: [A, AAAA]
      nameservers: [8.8.8.8]
      timeoutMs: 5000

| Config field | Type | Required | Description |
|---|---|---|---|
| hostname | string | Yes | Domain name to resolve |
| recordTypes | string[] | | DNS record types to query |
| nameservers | string[] | | Custom nameservers |
| timeoutMs | integer | | Query timeout in milliseconds |
| totalTimeoutMs | integer | | Total timeout across retries |
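
The example above omits totalTimeoutMs. A minimal sketch that sets both timeouts follows, assuming totalTimeoutMs is meant to bound the whole check (retries included) and should therefore be at least as large as timeoutMs. The monitor name and the specific values are illustrative.

```yaml
monitors:
  - name: DNS With Retry Budget
    type: DNS
    config:
      hostname: example.com
      recordTypes: [A, AAAA]
      nameservers: [8.8.8.8]
      timeoutMs: 2000        # limit for a single query
      totalTimeoutMs: 6000   # assumed overall limit across retries
```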

TCP

monitors:
  - name: Database Port
    type: TCP
    config:
      host: db.example.com
      port: 5432
      timeoutMs: 5000

| Config field | Type | Required | Description |
|---|---|---|---|
| host | string | Yes | Target hostname or IP |
| port | integer | | TCP port (1–65535) |
| timeoutMs | integer | | Connection timeout in milliseconds |

ICMP

monitors:
  - name: Server Ping
    type: ICMP
    config:
      host: 10.0.1.5
      packetCount: 3
      timeoutMs: 5000

| Config field | Type | Required | Description |
|---|---|---|---|
| host | string | Yes | Target hostname or IP |
| packetCount | integer | | Number of ICMP packets to send |
| timeoutMs | integer | | Timeout in milliseconds |

Heartbeat

monitors:
  - name: Nightly Backup
    type: HEARTBEAT
    config:
      expectedInterval: 86400
      gracePeriod: 3600

| Config field | Type | Required | Description |
|---|---|---|---|
| expectedInterval | integer | | Expected ping interval in seconds |
| gracePeriod | integer | | Grace period before marking as missed |

Heartbeat monitors generate a unique ping URL. Your service sends a request to this URL on each run. If no ping arrives within the expected interval plus grace period, an incident is created.
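
To make the deadline arithmetic concrete, here is a sketch for an hourly job. The monitor name is hypothetical, and gracePeriod is assumed to be in seconds like expectedInterval, which the 3600 value in the example above suggests but the table does not state.

```yaml
monitors:
  - name: Hourly Sync
    type: HEARTBEAT
    config:
      expectedInterval: 3600   # the job is expected to ping once per hour
      gracePeriod: 300         # assumed to be seconds
      # With these values, an incident would be created if no ping
      # arrives within 3600 + 300 = 3900 seconds of the previous one.
```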

MCP Server

monitors:
  - name: MCP Health
    type: MCP_SERVER
    config:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-everything"]
      env:
        API_KEY: ${MCP_API_KEY}

| Config field | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | Command to launch the MCP server |
| args | string[] | | Command arguments |
| env | map | | Environment variables |

Assertions

Assertions define pass/fail criteria for each check:
monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
    assertions:
      - type: StatusCodeAssertion
        config:
          expected: "200"
          operator: equals
        severity: fail
      - type: ResponseTimeAssertion
        config:
          thresholdMs: 2000
        severity: warn

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Assertion type name |
| config | object | | Type-specific assertion config |
| severity | string | | fail or warn (default: fail) |

Assertion types vary by monitor type. Common HTTP assertions include StatusCodeAssertion, ResponseTimeAssertion, HeaderAssertion, BodyContainsAssertion, and JsonPathAssertion.

Authentication

For monitors that need to authenticate against protected endpoints:
monitors:
  - name: Protected API
    type: HTTP
    config:
      url: https://api.example.com/internal/health
    auth:
      type: BearerAuthConfig
      secret: API_TOKEN

| Auth type | Fields | Description |
|---|---|---|
| BearerAuthConfig | secret | Bearer token from vault secret |
| BasicAuthConfig | secret | Base64 username:password from vault secret |
| ApiKeyAuthConfig | headerName, secret | Custom header with API key |
| HeaderAuthConfig | headerName, secret | Custom header with arbitrary value |

The secret field references a key from the secrets section or vault.
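
The header-based auth types take a headerName in addition to secret. A minimal sketch using ApiKeyAuthConfig follows. The monitor name, endpoint, header name, and secret key are hypothetical, and the secret is assumed to exist already in the secrets section or vault.

```yaml
monitors:
  - name: Partner API
    type: HTTP
    config:
      url: https://api.example.com/partner/health   # hypothetical endpoint
    auth:
      type: ApiKeyAuthConfig
      headerName: X-API-Key      # hypothetical header name
      secret: PARTNER_API_KEY    # hypothetical secret key
```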

Incident policy

Control when incidents are created, confirmed, and resolved:
monitors:
  - name: API Health
    type: HTTP
    config:
      url: https://api.example.com/health
    incidentPolicy:
      triggerRules:
        - type: consecutive_failures
          severity: down
          scope: per_region
          count: 3
        - type: response_time
          severity: degraded
          thresholdMs: 5000
          aggregationType: p95
      confirmation:
        type: multi_region
        minRegionsFailing: 2
        maxWaitSeconds: 120
      recovery:
        consecutiveSuccesses: 2
        minRegionsPassing: 2
        cooldownMinutes: 5

Trigger rules

| Field | Type | Description |
|---|---|---|
| type | string | consecutive_failures, failures_in_window, or response_time |
| severity | string | down or degraded |
| scope | string | per_region or any_region |
| count | integer | Failure count threshold |
| windowMinutes | integer | Time window for failures_in_window |
| thresholdMs | integer | Response time threshold for response_time |
| aggregationType | string | all_exceed, average, p95, max |
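
The policy example earlier does not use failures_in_window. A sketch of that rule type, built only from the fields in the table, might look like this. The monitor name and threshold values are illustrative. The confirmation and recovery blocks are left out for brevity, which assumes they are optional.

```yaml
monitors:
  - name: Flaky Upstream
    type: HTTP
    config:
      url: https://api.example.com/health
    incidentPolicy:
      triggerRules:
        - type: failures_in_window
          severity: degraded
          scope: any_region
          count: 5            # failure count threshold
          windowMinutes: 10   # window in which the failures are counted
```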

Confirmation

| Field | Type | Description |
|---|---|---|
| type | string | multi_region |
| minRegionsFailing | integer | Minimum failing regions to confirm |
| maxWaitSeconds | integer | Maximum time to wait for confirmation |

Recovery

| Field | Type | Description |
|---|---|---|
| consecutiveSuccesses | integer | Consecutive passing checks to recover |
| minRegionsPassing | integer | Minimum passing regions to recover |
| cooldownMinutes | integer | Delay before auto-resolving |

Resource groups

Group monitors into composite health views:
resourceGroups:
  - name: Payment Service
    description: All payment-related monitors
    alertPolicy: Critical Alerts
    defaultFrequency: 30
    defaultRegions: [us-east, eu-west]
    healthThresholdType: PERCENTAGE
    healthThresholdValue: 80
    suppressMemberAlerts: true
    monitors: [Payment API, Payment Webhook, Stripe Health]

| Field | Type | Description |
|---|---|---|
| name | string | Resource group name |
| description | string | Human-readable description |
| alertPolicy | string | Notification policy name for group alerts |
| defaultFrequency | integer | Default check frequency for members |
| defaultRegions | string[] | Default probe regions for members |
| defaultAlertChannels | string[] | Default alert channel names |
| defaultEnvironment | string | Default environment slug |
| healthThresholdType | string | COUNT or PERCENTAGE |
| healthThresholdValue | number | Threshold value |
| suppressMemberAlerts | boolean | Suppress individual member alerts |
| confirmationDelaySeconds | integer | Delay before confirming group incident |
| recoveryCooldownMinutes | integer | Delay before auto-resolving |
| monitors | string[] | Monitor names in this group |
| services | string[] | Dependency service slugs in this group |
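
Several fields in the table do not appear in the example above. The sketch below uses a COUNT threshold together with the default and timing fields. All names, slugs, and values are hypothetical, and healthThresholdValue is assumed to be a number of members when the type is COUNT; how the threshold is evaluated is not specified here.

```yaml
resourceGroups:
  - name: Checkout Stack
    description: Monitors and dependencies behind checkout
    defaultAlertChannels: [ops-slack]   # hypothetical alert channel name
    defaultEnvironment: production      # hypothetical environment slug
    healthThresholdType: COUNT
    healthThresholdValue: 2             # assumed to be a member count
    confirmationDelaySeconds: 60
    recoveryCooldownMinutes: 5
    monitors: [Checkout API, Payment API]   # must match monitor names
    services: [stripe]                  # hypothetical dependency service slug
```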

Next steps

Alert channels: Configure notification channels in YAML.
Tags and secrets: Tags, secrets, and environment variables.