Uptime monitoring continuously checks that your services are available and responding correctly. It’s the foundation of operational reliability: the mechanism that lets you learn about an outage before your users do.

How it works

An uptime monitor sends requests to your service at regular intervals from one or more locations. Each check verifies that:
  1. The service responds: the connection isn’t refused and doesn’t time out
  2. The response is correct: status codes, headers, or body content match expectations
  3. The response is fast enough: latency stays within acceptable thresholds
When a check fails, the monitoring system creates an incident and notifies your team through configured alert channels.
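
The three verifications above can be sketched as a single evaluation step. This is a minimal illustration, not DevHelm's implementation: the `ProbeResult` type, the `evaluate` function, and the default thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one request (hypothetical type for this sketch).

    status is None when the connection was refused or timed out.
    """
    status: Optional[int]
    body: str = ""
    latency_s: float = 0.0


def evaluate(result: ProbeResult, expected_status: int = 200,
             expected_text: str = "", max_latency_s: float = 2.0) -> str:
    # 1. Did the service respond at all?
    if result.status is None:
        return "unreachable"
    # 2. Is the response correct? (an empty expected_text always matches)
    if result.status != expected_status or expected_text not in result.body:
        return "incorrect"
    # 3. Is the response fast enough?
    if result.latency_s > max_latency_s:
        return "slow"
    return "ok"
```

Any result other than "ok" would count as a failed check. The order matters: a response that never arrived has no status or latency worth comparing.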

Types of uptime checks

Check type  | What it tests                          | Example
HTTP/HTTPS  | Web endpoint availability and response | GET /health returns 200
TCP         | Port connectivity                      | Database on port 5432 is accepting connections
DNS         | Domain resolution                      | api.example.com resolves to the correct IP
ICMP (ping) | Network reachability                   | Server at 10.0.1.5 responds to ping
Heartbeat   | Service self-reporting                 | Cron job pings a URL on each successful run
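
Heartbeat checks invert the usual direction: the service reports in, and the monitor alerts when a report is overdue. A minimal sketch of that overdue test follows. The function name and the `grace` parameter are assumptions made for illustration; a real monitor may define lateness differently.

```python
from datetime import datetime, timedelta


def heartbeat_missed(last_ping: datetime, now: datetime,
                     expected_every: timedelta,
                     grace: timedelta = timedelta(0)) -> bool:
    """Return True when more than interval + grace has passed since the last ping.

    grace is a hypothetical allowance for jobs that run slightly late.
    """
    return now - last_ping > expected_every + grace
```

For an hourly cron job with a five-minute grace period, a ping 59 minutes ago is fine, while silence for 66 minutes would count as a missed heartbeat.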

Why uptime monitoring matters

Detect outages before users report them

Without monitoring, you learn about outages from support tickets, social media, or, worse, lost revenue. Automated checks detect problems within seconds or minutes, depending on check frequency, rather than hours.

Measure reliability objectively

Uptime monitoring provides hard data: how many checks passed, what your availability percentage is, and how long incidents lasted. This feeds SLA compliance reports and internal reliability goals.
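
As a rough illustration of the arithmetic behind these numbers, the sketch below computes an availability percentage from check counts and the downtime a given target allows. The function names are hypothetical, and real SLA calculations may weight checks by time or exclude maintenance windows.

```python
def availability_percent(passed: int, total: int) -> float:
    """Share of checks that passed, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


def downtime_budget_minutes(target_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period while still meeting the target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - target_percent / 100.0)
```

For example, 9,990 passing checks out of 10,000 is 99.9% availability, and a 99.9% target over 30 days leaves a budget of about 43.2 minutes of downtime.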

Hold third parties accountable

If your application depends on external APIs or cloud services, monitoring their uptime gives you evidence when their outage impacts your users.

Build confidence in deployments

When you deploy changes, monitoring immediately tells you whether the new version is healthy — faster than any log-watching or manual testing.

Key concepts

Check frequency

How often the monitor sends a request. Higher frequency (e.g., every 30 seconds) detects problems faster but costs more and can generate noise. Lower frequency (e.g., every 5 minutes) is sufficient for less critical services.
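
The trade-off can be made concrete with a little arithmetic. The sketch below assumes a single check location and that any confirmation checks run on the same interval as regular checks; both are simplifying assumptions, and the function names are made up for the example.

```python
def checks_per_day(interval_s: int) -> int:
    """Number of checks one location sends in 24 hours."""
    return 86_400 // interval_s


def worst_case_detection_s(interval_s: int, confirmations: int = 1) -> int:
    """Upper bound on seconds from failure to confirmed detection.

    Assumes the failure starts just after a check and that `confirmations`
    consecutive failed checks are required; request time is ignored.
    """
    return interval_s * confirmations
```

Under these assumptions, a 30-second interval means 2,880 checks per day and a 5-minute interval means 288. Requiring two consecutive failures takes up to 60 seconds to confirm at the first setting and up to 10 minutes at the second.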

Multi-region checks

Running checks from multiple geographic locations (e.g., US East, EU West) distinguishes between global outages and regional network issues. If only one region reports failure, it might be a network path problem rather than a service outage.
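
One simple way to express this distinction in code is to compare the set of failing regions with the set of all regions. The sketch below is illustrative only; the region names and the `classify` function are assumptions, and real systems often apply a quorum rule instead of an all-or-nothing test.

```python
from typing import Mapping


def classify(results: Mapping[str, bool]) -> str:
    """results maps a region name to True if the check passed there."""
    if not results:
        raise ValueError("at least one region result is required")
    failed = sorted(region for region, ok in results.items() if not ok)
    if not failed:
        return "healthy"
    if len(failed) == len(results):
        return "global outage"
    # Only some regions fail: likely a network path problem.
    return "regional issue: " + ", ".join(failed)
```

With results such as `{"us-east": True, "eu-west": False}`, the function reports a regional issue in eu-west rather than a full outage.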

Assertions

Conditions that define whether a check passes or fails. Beyond “did the server respond,” assertions can verify:
  • HTTP status code equals 200
  • Response time is under 2 seconds
  • Response body contains a specific string
  • SSL certificate doesn’t expire within 30 days
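
The assertions listed above can be modeled as named predicates over a response. The sketch below is a hypothetical illustration: the `Response` fields, the function names, and the "ok" body string are assumptions, not a DevHelm API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Tuple


@dataclass
class Response:
    """What one check observed (hypothetical shape for this sketch)."""
    status: int
    body: str
    elapsed_s: float
    cert_expires: datetime


Assertion = Tuple[str, Callable[[Response], bool]]


def build_assertions(now: datetime) -> List[Assertion]:
    """One predicate per assertion in the list above."""
    return [
        ("status is 200", lambda r: r.status == 200),
        ("responds in under 2 seconds", lambda r: r.elapsed_s < 2.0),
        ('body contains "ok"', lambda r: "ok" in r.body),
        ("certificate valid for more than 30 days",
         lambda r: r.cert_expires - now > timedelta(days=30)),
    ]


def failed_assertions(resp: Response, assertions: List[Assertion]) -> List[str]:
    """Names of the assertions that did not hold; an empty list means the check passes."""
    return [name for name, predicate in assertions if not predicate(resp)]
```

Returning the names of the failed assertions, rather than a single boolean, lets the alert say which condition broke.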

Incident lifecycle

When checks fail, the monitoring system follows a defined process:
  1. Detection — check fails in one or more regions
  2. Confirmation — additional checks confirm the failure isn’t transient
  3. Notification — alert channels (Slack, PagerDuty, email) are triggered
  4. Resolution — checks start passing again, incident is closed
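
The four steps above can be sketched as a small state machine. This is a simplified illustration under stated assumptions: confirmation is modeled as a number of consecutive failed checks (`confirm_after`), and `notify` stands in for a real alert channel. The class and its names are hypothetical.

```python
from typing import Callable


class IncidentTracker:
    def __init__(self, confirm_after: int = 2,
                 notify: Callable[[str], None] = print):
        self.confirm_after = confirm_after  # failures needed to confirm
        self.notify = notify                # stand-in for Slack, PagerDuty, email
        self.consecutive_failures = 0
        self.incident_open = False

    def record(self, passed: bool) -> str:
        """Feed in one check result and return the resulting lifecycle step."""
        if passed:
            self.consecutive_failures = 0
            if self.incident_open:
                # Resolution: checks pass again, incident is closed.
                self.incident_open = False
                self.notify("resolved")
                return "resolved"
            return "ok"
        self.consecutive_failures += 1
        if self.incident_open:
            return "ongoing"
        if self.consecutive_failures >= self.confirm_after:
            # Confirmation reached: open the incident and notify.
            self.incident_open = True
            self.notify("incident opened")
            return "notified"
        # Detection: a failure was seen but is not yet confirmed.
        return "detected"
```

With `confirm_after=2`, a single failed check followed by a pass never triggers a notification, which is the point of the confirmation step.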

DevHelm monitoring

DevHelm supports all check types listed above, plus MCP Server monitoring for AI infrastructure. Get started:

Create your first monitor

Set up an HTTP monitor in under 5 minutes.

Monitor types

Full reference for all DevHelm monitor types.