This page covers the most common issues across the DevHelm platform and how to resolve them.

Authentication

Cause: Missing or invalid API token.
Fix:
  1. Verify the token is set:
echo $DEVHELM_API_TOKEN
  2. Confirm the token is valid:
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $DEVHELM_API_TOKEN" \
  -H "x-phelm-org-id: YOUR_ORG_ID" \
  https://api.devhelm.io/platform/me
  3. If using the CLI, check which context is active:
devhelm auth context list
If the token was revoked, generate a new one in the dashboard under Settings → API Keys.
Cause: The token is valid but doesn’t have access to the specified organization.
Fix: Verify the x-phelm-org-id header (or --org-id CLI flag) matches the organization the token belongs to. Tokens are scoped to the organization that created them.
Cause: No login session or context configured.
Fix: Either log in interactively or set the environment variables:
# Option 1: interactive login
devhelm auth login

# Option 2: environment variable
export DEVHELM_API_TOKEN=dh_live_...
export DEVHELM_ORG_ID=your-org-id

Monitors

Possible causes:
  • Firewall blocking probe IPs — DevHelm probes need to reach your endpoint. If you restrict inbound traffic, allowlist the probe IP ranges listed in Regions.
  • Assertion too strict — a response time assertion under 200ms may fail intermittently. Check the assertion configuration in the monitor detail view.
  • Single-region failure — if only one region reports failure, it may be a network path issue. Enable multi-region confirmation in your incident policy.
  • DNS resolution — the probe may resolve your domain differently than your local machine. Check the DNS records your domain returns from external resolvers.
Debug: Pull recent check results to see the actual response:
devhelm monitors results MONITOR_ID --limit 10
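If the CLI supports JSON output here (the --output json flag is shown for monitors list later in this guide), the results can be grouped by region to spot a single-region failure quickly. This is a sketch: the field names below (region, passed, responseTimeMs) are assumptions for illustration, not the documented schema.

```python
import json
from collections import Counter

# Hypothetical output of: devhelm monitors results MONITOR_ID --output json
# Field names are assumptions, not the documented schema.
raw = """[
  {"region": "us-east-1", "passed": true,  "responseTimeMs": 120},
  {"region": "eu-west-1", "passed": false, "responseTimeMs": 950},
  {"region": "eu-west-1", "passed": false, "responseTimeMs": 980}
]"""

# Count failures per region; failures confined to one region
# suggest a network-path issue rather than a real outage.
failures = Counter(r["region"] for r in json.loads(raw) if not r["passed"])
print(failures)
```

If every failure comes from a single region, multi-region confirmation (noted above) keeps that from paging you.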
Possible causes:
  • Monitoring the wrong endpoint — health endpoints can pass while user-facing routes fail. Monitor the actual endpoints users hit.
  • Assertions too loose — checking only status code 200 won’t catch a page that returns 200 with an error message in the body. Add a body content assertion.
  • Client-side issue — the server responds correctly but the frontend has JavaScript errors. Synthetic monitoring tests server responses, not client rendering.
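The assertions-too-loose case can be made concrete: a check that asserts only on the status code passes a 200 error page, while a body-content assertion catches it. A minimal sketch, where the required_text value is a placeholder for whatever your page should actually contain:

```python
def check_response(status_code: int, body: str,
                   expected_status: int = 200,
                   required_text: str = "Welcome") -> bool:
    """Pass only if both the status-code and body-content assertions hold."""
    if status_code != expected_status:
        return False
    # A 200 that renders an error page fails here:
    return required_text in body

check_response(200, "<h1>Welcome back</h1>")          # passes both assertions
check_response(200, "<h1>Something went wrong</h1>")  # 200, but the page is broken
```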
Cause: The endpoint takes longer to respond than the timeout threshold.
Fix:
  1. Check your endpoint’s actual response time from an external location
  2. If the endpoint is genuinely slow, increase the timeout in the monitor’s HTTP configuration
  3. If it’s only slow from certain regions, the issue may be geographic latency — consider monitoring from closer regions only
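One way to pick a new timeout after step 1 is to base it on observed response times plus a safety margin. This is a rough heuristic, not an official DevHelm recommendation:

```python
def suggested_timeout_ms(samples_ms: list, safety_factor: float = 2.0) -> int:
    """Suggest a timeout comfortably above the slowest observed response."""
    return int(max(samples_ms) * safety_factor)

# Slowest external measurement was 400 ms, so try an 800 ms timeout:
suggested_timeout_ms([120, 250, 400])  # 800
```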
Cause: The grace period is too short for your job’s execution time.
Fix: Set the heartbeat grace period to at least 2x the expected job duration. If your cron job takes 5 minutes to run, set the grace period to at least 10 minutes.
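The 2x rule above reduces to a simple overdue check. A sketch of the logic (DevHelm evaluates this server-side):

```python
from datetime import datetime, timedelta

def heartbeat_overdue(last_ping: datetime, expected_duration: timedelta,
                      now: datetime) -> bool:
    """Apply the rule above: grace period = 2x the expected job duration."""
    grace = 2 * expected_duration
    return now - last_ping > grace

# A 5-minute cron job gets a 10-minute grace period:
start = datetime(2024, 1, 1, 12, 0)
heartbeat_overdue(start, timedelta(minutes=5), start + timedelta(minutes=9))   # still within grace
heartbeat_overdue(start, timedelta(minutes=5), start + timedelta(minutes=11))  # overdue
```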

Alerting

Check these in order:
  1. Notification policy exists — the monitor must be matched by at least one notification policy. Check the policy’s match rules (tags, regions, monitor name patterns).
  2. Alert channels are assigned — the notification policy must have at least one channel in its escalation steps.
  3. Channel is verified — some channels (Slack, PagerDuty) require initial setup. Test the channel:
devhelm alert-channels test CHANNEL_ID
  4. Suppression is not active — check if a maintenance window or suppression rule is silencing the alert.
  5. Incident policy confirmation — if your incident policy requires multiple consecutive failures or multi-region confirmation, the alert won’t fire until the threshold is met.
Fix:
  • Increase the consecutive failure threshold in your incident policy (e.g., 2–3 failures before confirming)
  • Enable multi-region confirmation so single-region blips don’t alert
  • Use resource groups with suppressMemberAlerts to consolidate related monitors
  • Route low-severity alerts to a Slack channel instead of PagerDuty
  • Set up maintenance windows for planned deploys
See Reducing false positives for a deeper guide.
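The consecutive-failure threshold works roughly like this sketch: a single blip never reaches the threshold, so it never confirms an incident.

```python
def confirmed_incident(results: list, threshold: int = 3) -> bool:
    """Confirm an incident only after `threshold` consecutive failed checks.

    `results` is ordered oldest-to-newest; True means the check passed.
    """
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1
        if streak >= threshold:
            return True
    return False

# Isolated blips do not confirm; three consecutive failures do:
confirmed_incident([True, False, True, False])   # no confirmation
confirmed_incident([True, False, False, False])  # confirmed incident
```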
Possible causes:
  • Webhook URL expired — Slack webhook URLs can be revoked if the app is reinstalled. Re-create the webhook in Slack and update the channel config.
  • Channel archived — messages to archived Slack channels are silently dropped.
  • Rate limited by Slack — Slack throttles incoming webhooks. If you’re sending many alerts simultaneously, some may be delayed.
Test: Send a test notification to verify delivery:
devhelm alert-channels test CHANNEL_ID
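If Slack throttling (HTTP 429) is the cause, retrying with exponential backoff usually gets the message through. A hedged sketch, where send stands in for whatever delivery function you use and returns the HTTP status code:

```python
import time

def send_with_backoff(send, payload, max_attempts=4, base_delay=1.0):
    """Retry a throttled delivery, doubling the wait after each 429."""
    for attempt in range(max_attempts):
        status = send(payload)
        if status != 429:
            return status
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return status
```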

Deploys (Monitoring as Code)

Cause: A previous deploy didn’t release its lock (crashed, timed out, or is still running).
Fix:
  1. Check if another deploy is actually running (CI pipeline, teammate)
  2. If the lock is stale, force-unlock it:
devhelm deploy force-unlock
Locks expire automatically after 10 minutes, but force-unlock is safe if you’re sure no other deploy is in progress.
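The staleness rule reduces to a timestamp comparison. A sketch of the logic (the real lock lives server-side):

```python
from datetime import datetime, timedelta

LOCK_TTL = timedelta(minutes=10)  # deploy locks expire automatically after 10 minutes

def lock_is_stale(acquired_at: datetime, now: datetime) -> bool:
    """A lock held longer than its TTL is safe to force-unlock."""
    return now - acquired_at > LOCK_TTL

t0 = datetime(2024, 1, 1, 12, 0)
lock_is_stale(t0, t0 + timedelta(minutes=5))   # a deploy may still be running
lock_is_stale(t0, t0 + timedelta(minutes=15))  # past the 10-minute TTL
```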
Cause: The YAML config has schema errors.
Fix: Run validation separately to see detailed errors:
devhelm validate -f devhelm.yml
Common validation issues:
  • Missing required fields (name, type on monitors)
  • Invalid enum values (e.g., type: http instead of type: HTTP)
  • Referencing a secret that doesn’t exist in the secrets section
  • Invalid frequency value (must be 30–86400)
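Most of those checks can be expressed as a quick pre-flight sketch. This is not the real validator (devhelm validate remains the source of truth), and secret references are omitted since they need the full config:

```python
def validate_monitor(monitor: dict) -> list:
    """Flag the common schema mistakes listed above."""
    errors = []
    for field in ("name", "type"):
        if field not in monitor:
            errors.append(f"missing required field: {field}")
    if monitor.get("type") == "http":
        errors.append("invalid enum value: use type: HTTP, not type: http")
    freq = monitor.get("frequency")
    if freq is not None and not 30 <= freq <= 86400:
        errors.append("frequency must be between 30 and 86400")
    return errors

validate_monitor({"name": "homepage", "type": "HTTP", "frequency": 60})  # no errors
validate_monitor({"type": "http", "frequency": 10})  # three errors
```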
Cause: Resources that exist in the API but are missing from your YAML file are pruned on deploy.
Fix:
  • Always run devhelm plan first to preview changes before deploying
  • If a resource was created in the dashboard and you don’t want it pruned, add it to your YAML file
  • The plan output shows deletions as - delete: Monitor "..." — review carefully
See Drift and locking for details on how DevHelm handles resources not in the config file.
Cause: The YAML field names may not match what the API expects. DevHelm YAML uses camelCase keys.
Fix: Verify field names match the schema in YAML file format. Common mistakes:
  • frequency_seconds → should be frequency (value in seconds)
  • alert_channels → should be alertChannels
  • notification_policies → should be notificationPolicies
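A mechanical rename covers the last two cases (a sketch). Note it does not fix frequency_seconds, which maps to a differently named field (frequency), not just a case change:

```python
def to_camel(key: str) -> str:
    """Convert a snake_case key to the camelCase the DevHelm schema expects."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)

to_camel("alert_channels")         # "alertChannels"
to_camel("notification_policies") # "notificationPolicies"
```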

CLI

Cause: The CLI is not installed or not in your PATH.
Fix:
npm install -g devhelm
devhelm --version
If using npx: npx devhelm monitors list
See CLI installation for full setup instructions.
Cause: The command may have returned an error that’s hidden in table format.
Fix: Switch to JSON output for full details:
devhelm monitors list --output json
Or enable verbose mode:
devhelm monitors list --verbose
Cause: The browser-based login flow can’t open a browser (common in headless environments like CI or SSH sessions).
Fix: Use token-based auth instead:
export DEVHELM_API_TOKEN=dh_live_...
export DEVHELM_ORG_ID=your-org-id
Or create a context manually:
devhelm auth context create --name ci \
  --api-token "$DEVHELM_API_TOKEN" \
  --org-id "$DEVHELM_ORG_ID"

SDKs

Cause: The client was initialized without a valid token or the token expired.
Fix:
import { DevHelm } from "@devhelm/sdk";

const client = new DevHelm({
  apiToken: process.env.DEVHELM_API_TOKEN,
  orgId: process.env.DEVHELM_ORG_ID,
});
Verify the environment variables are set in your shell before running the script.
Cause: The SDK can’t reach the API (network issue, firewall, or wrong base URL).
Fix:
  1. Test connectivity: curl https://api.devhelm.io/health
  2. If using a custom base URL, verify it:
# Import name assumed from the package naming; check the SDK docs
from devhelm import DevHelm

client = DevHelm(
    api_token="...",
    org_id="...",
    base_url="https://api.devhelm.io",  # default
)
  3. Check proxy settings if you’re behind a corporate firewall

MCP Server

Cause: Configuration file syntax error or wrong file location.
Fix:
  1. Verify the config file path:
    • Cursor: .cursor/mcp.json in your project root
    • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
  2. Validate JSON syntax (trailing commas, missing quotes)
  3. Restart the AI client after editing the config
  4. Check that npx @devhelm/mcp-server runs without errors in your terminal
See MCP configuration for setup details.
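Step 2 (JSON syntax) can be checked with any strict JSON parser; trailing commas are the usual culprit. A sketch using Python’s standard library:

```python
import json

def json_syntax_error(text: str):
    """Return a human-readable syntax error, or None if the config parses."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as e:
        return f"line {e.lineno}, column {e.colno}: {e.msg}"

json_syntax_error('{"mcpServers": {}}')   # None: valid
json_syntax_error('{"mcpServers": {},}')  # trailing comma: error message
```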
Cause: Missing or invalid environment variables in the MCP server config.
Fix: Ensure all required env vars are set in the MCP config:
{
  "mcpServers": {
    "devhelm": {
      "command": "npx",
      "args": ["-y", "@devhelm/mcp-server"],
      "env": {
        "DEVHELM_API_TOKEN": "dh_live_...",
        "DEVHELM_ORG_ID": "your-org-id"
      }
    }
  }
}

Terraform

Cause: The API normalizes or transforms a field value differently than what was specified in the config (e.g., trailing slashes on URLs, case normalization).
Fix: Run terraform plan again. If the diff shows the API-normalized value, update your .tf file to match. If the issue persists, open an issue.
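Trailing-slash normalization is the typical case: if the API stores https://example.com for a configured value of https://example.com/, every plan shows a one-character diff until the .tf file matches. A sketch of that normalization (an assumption about the API’s behavior, shown for illustration):

```python
def normalize_url(url: str) -> str:
    """Strip a trailing slash, the way many APIs canonicalize URLs."""
    return url.rstrip("/") or url

normalize_url("https://example.com/")  # "https://example.com"
```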
Cause: Another process (dashboard, CLI deploy, another Terraform run) modified the resource concurrently.
Fix: Run terraform refresh to sync state with the API, then re-plan and apply.

Still stuck?

If none of the above resolve your issue:
  1. Check the error codes reference for the specific HTTP status and message
  2. Review rate limits if you’re seeing 429 responses
  3. Contact support at support@devhelm.io with:
    • The exact error message
    • The API endpoint or CLI command you ran
    • Your organization ID
    • Timestamps of when the issue occurred