Beyond “is it up?”
A basic uptime check tells you whether an API responds. Production API monitoring should verify:
- Correctness: the response body contains expected data
- Performance: response time stays within budget
- Authentication: protected endpoints reject invalid credentials
- Error rates: the API isn’t silently returning errors
Health check endpoints
Design dedicated health endpoints that test real dependencies. A good health endpoint:
- Tests actual database connectivity (not just process status)
- Returns structured data that monitors can assert on
- Responds quickly (under 500ms) to avoid timeout false positives
- Is unauthenticated (monitoring probes shouldn’t need API keys for health checks)
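The properties above can be sketched with Python's standard library alone. This is an illustrative sketch, not a prescribed implementation: the function name `build_health_response`, the field names in the body, and the use of SQLite as a stand-in for a real database are all assumptions.

```python
import json
import sqlite3
import time


def build_health_response(conn):
    """Return (http_status, json_body) describing the service's health.

    `conn` is a sqlite3 connection standing in for a real database client.
    """
    started = time.monotonic()
    try:
        # Run a trivial query so the check exercises the real connection,
        # not just the fact that the process is alive.
        conn.execute("SELECT 1").fetchone()
        database = "connected"
    except sqlite3.Error:
        database = "disconnected"
    elapsed_ms = round((time.monotonic() - started) * 1000, 2)

    healthy = database == "connected"
    body = {
        "status": "healthy" if healthy else "unhealthy",
        "database": database,
        "check_duration_ms": elapsed_ms,
    }
    # Returning 503 on failure lets status-code-only monitors catch it too.
    return (200 if healthy else 503), json.dumps(body)
```

A web framework route would call this function and return the status and body unchanged, without requiring credentials. Keeping the dependency query trivial helps the endpoint stay well under the 500ms target.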
Assertion strategies
Status code checks
The simplest assertion is to verify that the API returns the expected HTTP status:
- 200 OK for successful requests
- 401 Unauthorized for missing auth (confirms auth middleware is running)
- 301/302 for redirect endpoints
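As a rough illustration, a status assertion reduces to comparing the observed code with one or more acceptable codes. The helper name `check_status` and the example paths below are hypothetical.

```python
def check_status(actual, expected):
    """Return True when the observed HTTP status matches the expectation.

    `expected` may be a single code or a collection of acceptable codes.
    """
    allowed = {expected} if isinstance(expected, int) else set(expected)
    return actual in allowed


# Hypothetical checks: (description, observed status, expected status)
checks = [
    ("GET /health", 200, 200),
    ("GET /orders without credentials", 401, 401),
    ("GET /old-path", 302, (301, 302)),
]

for name, actual, expected in checks:
    result = "pass" if check_status(actual, expected) else "fail"
    print(f"{name}: {result}")
```

When asserting on 301/302, the HTTP client must be configured not to follow redirects. Otherwise the check sees the status of the final destination rather than the redirect itself.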
Response body validation
Check that the response contains expected fields or values:
- JSON path assertion: $.status equals "healthy"
- Body contains: response includes "database": "connected"
- Schema validation: response matches expected structure
Response time thresholds
Set latency budgets based on your SLA:

| Endpoint type | Typical threshold |
|---|---|
| Health check | < 500ms |
| Public API | < 1s |
| Search/aggregate | < 2s |
| Webhook delivery | < 5s |
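One way to apply these budgets in a check is to time the call and compare the result against the threshold for that endpoint type. The dictionary keys and function names below are illustrative labels, and the thresholds simply mirror the table above.

```python
import time

# Budgets in seconds, mirroring the table above; keys are illustrative labels.
LATENCY_BUDGETS = {
    "health": 0.5,
    "public_api": 1.0,
    "search": 2.0,
    "webhook": 5.0,
}


def within_budget(endpoint_type, elapsed_seconds):
    """Return True when the measured latency is under the endpoint's budget."""
    return elapsed_seconds < LATENCY_BUDGETS[endpoint_type]


def timed(call):
    """Run `call()` and return (result, elapsed_seconds)."""
    started = time.perf_counter()
    result = call()
    return result, time.perf_counter() - started
```

In a real probe, `call` would perform the HTTP request, and the elapsed time would be passed to `within_budget` along with the endpoint type.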
SSL certificate monitoring
Check that certificates don’t expire unexpectedly. Alert 30 days before expiration to give your team time to renew.
Multi-region monitoring
Run API checks from multiple geographic locations to:
- Detect regional outages (CDN, DNS, or cloud region issues)
- Measure latency from your users’ locations
- Validate that geo-routing works correctly
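Comparing results across regions is what separates a regional problem from a global one. The following sketch assumes each region reports a simple pass/fail result; the function name, the labels it returns, and the region names in the usage note are hypothetical.

```python
def classify_regions(results):
    """Summarize per-region check results.

    `results` maps a region name to True (check passed) or False (failed).
    Returns a (summary, failed_regions) tuple.
    """
    failed = sorted(region for region, ok in results.items() if not ok)
    if not failed:
        return "healthy", failed
    if len(failed) == len(results):
        return "global_outage", failed
    return "regional_outage", failed
```

For example, `classify_regions({"us-east": True, "eu-west": False})` reports a regional outage in `eu-west`, which points toward a CDN, DNS, or cloud region issue rather than a failure of the API itself.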
Monitoring authenticated endpoints
Protected APIs need authentication for monitoring checks:
- Bearer tokens: store in your monitoring platform’s secret vault
- API keys: use a dedicated monitoring key with read-only permissions
- Basic auth: for internal services behind network boundaries
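For a hand-rolled probe, the bearer and basic schemes come down to setting the `Authorization` header. The sketch below uses Python's `urllib.request`; the environment variable `MONITORING_TOKEN` and the URL are hypothetical, and reading from the environment stands in for whatever secret store your platform provides. API key header names vary by API, so they are not shown.

```python
import base64
import os
import urllib.request


def bearer_request(url, token):
    """Build a GET request that carries a bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def basic_auth_request(url, username, password):
    """Build a GET request that carries HTTP Basic credentials."""
    encoded = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {encoded}"})


# Read the secret from the environment rather than hard-coding it.
# MONITORING_TOKEN and the URL are hypothetical names; the fallback value
# is a placeholder so the example runs without configuration.
token = os.environ.get("MONITORING_TOKEN", "example-token")
request = bearer_request("https://api.example.com/v1/orders", token)
```

The request object is only constructed here, not sent. Passing it to `urllib.request.urlopen` with a timeout would perform the actual check.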
Alert on degradation, not just failure
Don’t wait for complete failure to alert:
- Degraded response time: p95 above 2s triggers a warning
- Elevated error rate: more than 1% of checks fail in a 5-minute window
- Partial failures: health check passes but reports a degraded dependency
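These three rules can be evaluated over a window of recent check results. The sketch below is one possible interpretation: the result dictionary shape, the warning labels, and the nearest-rank percentile method are assumptions, while the 2s and 1% thresholds come from the list above. Collecting a 5-minute window of results is left to the caller.

```python
def p95(samples):
    """95th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def evaluate_window(results, p95_budget_s=2.0, max_error_rate=0.01):
    """Return the list of warnings raised by one window of check results.

    Each result is a dict with `ok` (bool), `latency_s` (float), and
    `degraded` (bool, True when the health body reports a degraded dependency).
    """
    if not results:
        return []
    warnings = []
    error_rate = sum(not r["ok"] for r in results) / len(results)
    if error_rate > max_error_rate:
        warnings.append("elevated_error_rate")
    if p95([r["latency_s"] for r in results]) > p95_budget_s:
        warnings.append("degraded_response_time")
    if any(r["ok"] and r["degraded"] for r in results):
        warnings.append("partial_failure")
    return warnings
```

Returning warnings separately from hard failures lets an alerting layer route them to a lower-urgency channel, so the team hears about degradation before it becomes an outage.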
DevHelm API monitoring
HTTP monitors
Create HTTP monitors with assertions and multi-region checks.
Authenticated endpoints
Monitor protected APIs with vault secrets.