This page covers the most common issues across the DevHelm platform and how to resolve them.

Authentication

Cause: Missing or invalid API token.
Fix:
  1. Verify the token is set:
echo $DEVHELM_API_TOKEN
  2. Confirm the token is valid:
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $DEVHELM_API_TOKEN" \
  -H "x-phelm-org-id: YOUR_ORG_ID" \
  https://api.devhelm.io/platform/me
  3. If using the CLI, check which context is active:
devhelm auth context list
If the token was revoked, generate a new one in the dashboard under Settings → API Keys.
Cause: The token is valid but doesn’t have access to the specified organization.
Fix: Verify the x-phelm-org-id header (or --org-id CLI flag) matches the organization the token belongs to. Tokens are scoped to the organization that created them.
Cause: No login session or context configured.
Fix: Either log in interactively or set the environment variables:
# Option 1: interactive login
devhelm auth login

# Option 2: environment variable
export DEVHELM_API_TOKEN=dh_live_...
export DEVHELM_ORG_ID=your-org-id

Monitors

Possible causes:
  • Firewall blocking probe IPs — DevHelm probes need to reach your endpoint. If you restrict inbound traffic, allowlist the probe IP ranges listed in Regions.
  • Assertion too strict — a response time assertion under 200ms may fail intermittently. Check the assertion configuration in the monitor detail view.
  • Single-region failure — if only one region reports failure, it may be a network path issue. Enable multi-region confirmation in your incident policy.
  • DNS resolution — the probe may resolve your domain differently than your local machine. Check the DNS records your domain returns from external resolvers.
Debug: Pull recent check results to see the actual response:
devhelm monitors results MONITOR_ID --limit 10
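If the CLI supports JSON output here (the --output json flag is shown for monitors list later in this guide), the results can be grouped by region to spot a single-region failure quickly. This is a sketch: the field names below (region, passed, responseTimeMs) are assumptions for illustration, not the documented schema.

```python
import json
from collections import Counter

# Hypothetical output of: devhelm monitors results MONITOR_ID --output json
# Field names are assumptions, not the documented schema.
raw = """[
  {"region": "us-east-1", "passed": true,  "responseTimeMs": 120},
  {"region": "eu-west-1", "passed": false, "responseTimeMs": 950},
  {"region": "eu-west-1", "passed": false, "responseTimeMs": 980}
]"""

# Count failures per region; failures confined to one region
# suggest a network-path issue rather than a real outage.
failures = Counter(r["region"] for r in json.loads(raw) if not r["passed"])
print(failures)
```

If every failure comes from a single region, multi-region confirmation (noted above) keeps that from paging you.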
Possible causes:
  • Monitoring the wrong endpoint — health endpoints can pass while user-facing routes fail. Monitor the actual endpoints users hit.
  • Assertions too loose — checking only status code 200 won’t catch a page that returns 200 with an error message in the body. Add a body content assertion.
  • Client-side issue — the server responds correctly but the frontend has JavaScript errors. Synthetic monitoring tests server responses, not client rendering.
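The assertions-too-loose case can be made concrete: a check that asserts only on the status code passes a 200 error page, while a body-content assertion catches it. A minimal sketch, where the required_text value is a placeholder for whatever your page should actually contain:

```python
def check_response(status_code: int, body: str,
                   expected_status: int = 200,
                   required_text: str = "Welcome") -> bool:
    """Pass only if both the status-code and body-content assertions hold."""
    if status_code != expected_status:
        return False
    # A 200 that renders an error page fails here:
    return required_text in body

check_response(200, "<h1>Welcome back</h1>")          # passes both assertions
check_response(200, "<h1>Something went wrong</h1>")  # 200, but the page is broken
```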
Cause: The endpoint takes longer to respond than the timeout threshold.
Fix:
  1. Check your endpoint’s actual response time from an external location
  2. If the endpoint is genuinely slow, increase the timeout in the monitor’s HTTP configuration
  3. If it’s only slow from certain regions, the issue may be geographic latency — consider monitoring from closer regions only
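One way to pick a new timeout after step 1 is to base it on observed response times plus a safety margin. This is a rough heuristic, not an official DevHelm recommendation:

```python
def suggested_timeout_ms(samples_ms: list, safety_factor: float = 2.0) -> int:
    """Suggest a timeout comfortably above the slowest observed response."""
    return int(max(samples_ms) * safety_factor)

# Slowest external measurement was 400 ms, so try an 800 ms timeout:
suggested_timeout_ms([120, 250, 400])  # 800
```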
Cause: The grace period is too short for your job’s execution time.
Fix: Set the heartbeat grace period to at least 2x the expected job duration. If your cron job takes 5 minutes to run, set the grace period to at least 10 minutes.
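The 2x rule above reduces to a simple overdue check. A sketch of the logic (DevHelm evaluates this server-side):

```python
from datetime import datetime, timedelta

def heartbeat_overdue(last_ping: datetime, expected_duration: timedelta,
                      now: datetime) -> bool:
    """Apply the rule above: grace period = 2x the expected job duration."""
    grace = 2 * expected_duration
    return now - last_ping > grace

# A 5-minute cron job gets a 10-minute grace period:
start = datetime(2024, 1, 1, 12, 0)
heartbeat_overdue(start, timedelta(minutes=5), start + timedelta(minutes=9))   # still within grace
heartbeat_overdue(start, timedelta(minutes=5), start + timedelta(minutes=11))  # overdue
```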

Alerting

Check these in order:
  1. Notification policy exists — the monitor must be matched by at least one notification policy. Check the policy’s match rules (tags, regions, monitor name patterns).
  2. Alert channels are assigned — the notification policy must have at least one channel in its escalation steps.
  3. Channel is verified — some channels (Slack, PagerDuty) require initial setup. Test the channel:
devhelm alert-channels test CHANNEL_ID
  4. Suppression is not active — check if a maintenance window or suppression rule is silencing the alert.
  5. Incident policy confirmation — if your incident policy requires multiple consecutive failures or multi-region confirmation, the alert won’t fire until the threshold is met.
Fix:
  • Increase the consecutive failure threshold in your incident policy (e.g., 2–3 failures before confirming)
  • Enable multi-region confirmation so single-region blips don’t alert
  • Use resource groups with suppressMemberAlerts to consolidate related monitors
  • Route low-severity alerts to a Slack channel instead of PagerDuty
  • Set up maintenance windows for planned deploys
See Reducing false positives for a deeper guide.
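The consecutive-failure threshold works roughly like this sketch: a single blip never reaches the threshold, so it never confirms an incident.

```python
def confirmed_incident(results: list, threshold: int = 3) -> bool:
    """Confirm an incident only after `threshold` consecutive failed checks.

    `results` is ordered oldest-to-newest; True means the check passed.
    """
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1
        if streak >= threshold:
            return True
    return False

# Isolated blips do not confirm; three consecutive failures do:
confirmed_incident([True, False, True, False])   # no confirmation
confirmed_incident([True, False, False, False])  # confirmed incident
```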
Possible causes:
  • Webhook URL expired — Slack webhook URLs can be revoked if the app is reinstalled. Re-create the webhook in Slack and update the channel config.
  • Channel archived — messages to archived Slack channels are silently dropped.
  • Rate limited by Slack — Slack throttles incoming webhooks. If you’re sending many alerts simultaneously, some may be delayed.
Test: Send a test notification to verify delivery:
devhelm alert-channels test CHANNEL_ID
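If Slack throttling (HTTP 429) is the cause, retrying with exponential backoff usually gets the message through. A hedged sketch, where send stands in for whatever delivery function you use and returns the HTTP status code:

```python
import time

def send_with_backoff(send, payload, max_attempts=4, base_delay=1.0):
    """Retry a throttled delivery, doubling the wait after each 429."""
    for attempt in range(max_attempts):
        status = send(payload)
        if status != 429:
            return status
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return status
```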

Deploys (Monitoring as Code)

Cause: A previous deploy didn’t release its lock (crashed, timed out, or is still running).
Fix:
  1. Check if another deploy is actually running (CI pipeline, teammate)
  2. If the lock is stale, force-unlock it:
devhelm deploy force-unlock
Locks expire automatically after 10 minutes, but force-unlock is safe if you’re sure no other deploy is in progress.
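The staleness rule reduces to a timestamp comparison. A sketch of the logic (the real lock lives server-side):

```python
from datetime import datetime, timedelta

LOCK_TTL = timedelta(minutes=10)  # deploy locks expire automatically after 10 minutes

def lock_is_stale(acquired_at: datetime, now: datetime) -> bool:
    """A lock held longer than its TTL is safe to force-unlock."""
    return now - acquired_at > LOCK_TTL

t0 = datetime(2024, 1, 1, 12, 0)
lock_is_stale(t0, t0 + timedelta(minutes=5))   # a deploy may still be running
lock_is_stale(t0, t0 + timedelta(minutes=15))  # past the 10-minute TTL
```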
Cause: The YAML config has schema errors.
Fix: Run validation separately to see detailed errors:
devhelm validate -f devhelm.yml
Common validation issues:
  • Missing required fields (name, type on monitors)
  • Invalid enum values (e.g., type: http instead of type: HTTP)
  • Referencing a secret that doesn’t exist in the secrets section
  • Invalid frequency value (must be 30–86400)
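Most of those checks can be expressed as a quick pre-flight sketch. This is not the real validator (devhelm validate remains the source of truth), and secret references are omitted since they need the full config:

```python
def validate_monitor(monitor: dict) -> list:
    """Flag the common schema mistakes listed above."""
    errors = []
    for field in ("name", "type"):
        if field not in monitor:
            errors.append(f"missing required field: {field}")
    if monitor.get("type") == "http":
        errors.append("invalid enum value: use type: HTTP, not type: http")
    freq = monitor.get("frequency")
    if freq is not None and not 30 <= freq <= 86400:
        errors.append("frequency must be between 30 and 86400")
    return errors

validate_monitor({"name": "homepage", "type": "HTTP", "frequency": 60})  # no errors
validate_monitor({"type": "http", "frequency": 10})  # three errors
```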
Cause: Resources that exist in the API but are missing from your YAML file are pruned on deploy.
Fix:
  • Always run devhelm plan first to preview changes before deploying
  • If a resource was created in the dashboard and you don’t want it pruned, add it to your YAML file
  • The plan output shows deletions as - delete: Monitor "..." — review carefully
See Drift and locking for details on how DevHelm handles resources not in the config file.
Cause: The YAML field names may not match what the API expects. DevHelm YAML uses camelCase keys.
Fix: Verify field names match the schema in YAML file format. Common mistakes:
  • frequency_seconds → should be frequency (value in seconds)
  • alert_channels → should be alertChannels
  • notification_policies → should be notificationPolicies
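A mechanical rename covers the last two cases (a sketch). Note it does not fix frequency_seconds, which maps to a differently named field (frequency), not just a case change:

```python
def to_camel(key: str) -> str:
    """Convert a snake_case key to the camelCase the DevHelm schema expects."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)

to_camel("alert_channels")         # "alertChannels"
to_camel("notification_policies") # "notificationPolicies"
```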

CLI

Cause: The CLI is not installed or not in your PATH.
Fix:
npm install -g devhelm
devhelm --version
If using npx: npx devhelm monitors list
See CLI installation for full setup instructions.
Cause: The command may have returned an error that’s hidden in table format.
Fix: Switch to JSON output for full details:
devhelm monitors list --output json
Or enable verbose mode:
devhelm monitors list --verbose
Cause: The browser-based login flow can’t open a browser (common in headless environments like CI or SSH sessions).
Fix: Use token-based auth instead:
export DEVHELM_API_TOKEN=dh_live_...
export DEVHELM_ORG_ID=your-org-id
Or create a context manually:
devhelm auth context create --name ci \
  --api-token "$DEVHELM_API_TOKEN" \
  --org-id "$DEVHELM_ORG_ID"

SDKs

Cause: The client was initialized without a valid token or the token expired.
Fix:
import { DevHelm } from "@devhelm/sdk";

const client = new DevHelm({
  apiToken: process.env.DEVHELM_API_TOKEN,
  orgId: process.env.DEVHELM_ORG_ID,
});
Verify the environment variables are set in your shell before running the script.
Cause: The SDK can’t reach the API (network issue, firewall, or wrong base URL).
Fix:
  1. Test connectivity: curl https://api.devhelm.io/health
  2. If using a custom base URL, verify it:
# Import name assumed from the package naming; check the SDK docs
from devhelm import DevHelm

client = DevHelm(
    api_token="...",
    org_id="...",
    base_url="https://api.devhelm.io",  # default
)
  3. Check proxy settings if you’re behind a corporate firewall

MCP Server

Cause: Configuration file syntax error or wrong file location.
Fix:
  1. Verify the config file path:
    • Cursor: .cursor/mcp.json in your project root
    • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
  2. Validate JSON syntax (trailing commas, missing quotes)
  3. Restart the AI client after editing the config
  4. Check that npx @devhelm/mcp-server runs without errors in your terminal
See MCP configuration for setup details.
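Step 2 (JSON syntax) can be checked with any strict JSON parser; trailing commas are the usual culprit. A sketch using Python’s standard library:

```python
import json

def json_syntax_error(text: str):
    """Return a human-readable syntax error, or None if the config parses."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as e:
        return f"line {e.lineno}, column {e.colno}: {e.msg}"

json_syntax_error('{"mcpServers": {}}')   # None: valid
json_syntax_error('{"mcpServers": {},}')  # trailing comma: error message
```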
Cause: Missing or invalid environment variables in the MCP server config.
Fix: Ensure all required env vars are set in the MCP config:
{
  "mcpServers": {
    "devhelm": {
      "command": "npx",
      "args": ["-y", "@devhelm/mcp-server"],
      "env": {
        "DEVHELM_API_TOKEN": "dh_live_...",
        "DEVHELM_ORG_ID": "your-org-id"
      }
    }
  }
}

Terraform

Cause: The API normalizes or transforms a field value differently than what was specified in the config (e.g., trailing slashes on URLs, case normalization).
Fix: Run terraform plan again. If the diff shows the API-normalized value, update your .tf file to match. If the issue persists, open an issue.
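Trailing-slash normalization is the typical case: if the API stores https://example.com for a configured value of https://example.com/, every plan shows a one-character diff until the .tf file matches. A sketch of that normalization (an assumption about the API’s behavior, shown for illustration):

```python
def normalize_url(url: str) -> str:
    """Strip a trailing slash, the way many APIs canonicalize URLs."""
    return url.rstrip("/") or url

normalize_url("https://example.com/")  # "https://example.com"
```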
Cause: Another process (dashboard, CLI deploy, another Terraform run) modified the resource concurrently.
Fix: Run terraform refresh to sync state with the API, then re-plan and apply.

Still stuck?

If none of the above resolve your issue:
  1. Check the error codes reference for the specific HTTP status and message
  2. Review rate limits if you’re seeing 429 responses
  3. Contact support at support@devhelm.io with:
    • The exact error message
    • The API endpoint or CLI command you ran
    • Your organization ID
    • Timestamps of when the issue occurred