Internal communication
Dedicated incident channel
Create a dedicated channel (Slack, Teams) for each significant incident:
- Name it clearly: #incident-2025-03-15-api-outage
- Pin the current status and timeline
- Keep discussion focused — off-topic conversation goes elsewhere
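
Channel names are easier to keep consistent when they are generated rather than typed by hand. The sketch below is a minimal illustration, assuming the `#incident-YYYY-MM-DD-slug` pattern inferred from the example name above. The helper `incident_channel_name` is hypothetical and does not call any Slack or Teams API.

```python
import re
from datetime import date


def incident_channel_name(opened: date, summary: str) -> str:
    """Build a channel name such as #incident-2025-03-15-api-outage.

    Hypothetical helper: the naming pattern is inferred from the example
    in the text, not taken from any chat platform's API.
    """
    # Lowercase the summary, then collapse every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    return f"#incident-{opened.isoformat()}-{slug}"


print(incident_channel_name(date(2025, 3, 15), "API outage"))
# prints: #incident-2025-03-15-api-outage
```

Putting the date first keeps channels sorted chronologically in most channel lists.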
Regular status updates
Post updates at fixed intervals, even if nothing has changed:
14:30 update: Still investigating. Database team is looking at connection pool exhaustion. No ETA yet. Next update in 15 minutes.
“No update” is worse than “still working on it.” Silence breeds anxiety.
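
One way to make sure every update announces the next one is to build the message with a small formatter. This is a sketch only: the function name `format_update` and the exact wording are assumptions modeled on the 14:30 example, not a prescribed format.

```python
from datetime import datetime, timedelta


def format_update(now: datetime, status: str, interval_minutes: int = 15) -> str:
    """Render a timestamped update that always states when the next one is due."""
    next_at = now + timedelta(minutes=interval_minutes)
    return (
        f"{now:%H:%M} update: {status} "
        f"Next update in {interval_minutes} minutes (by {next_at:%H:%M})."
    )


print(format_update(datetime(2025, 3, 15, 14, 30), "Still investigating. No ETA yet."))
# prints: 14:30 update: Still investigating. No ETA yet. Next update in 15 minutes (by 14:45).
```

Including the absolute time of the next update lets readers who join late see at a glance whether an update is overdue.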
What to communicate internally
| Timing | Content |
|---|---|
| Incident opened | What’s broken, severity, who’s responding |
| Every 15–30 min | Current status, what’s been tried, what’s next |
| Mitigation applied | What was done, is it working, risk of recurrence |
| Resolved | Confirmation, duration, brief root cause |
| Post-incident | Postmortem link, action items |
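
The table above can double as a checklist. The following sketch assumes each update is collected as a simple dictionary before posting. The stage keys and field names are hypothetical labels chosen to mirror the table rows, and `missing_fields` is an illustrative helper.

```python
# Fields each kind of internal update should carry, mirroring the table.
# The key and field names are illustrative assumptions.
REQUIRED_FIELDS = {
    "opened": ["what_is_broken", "severity", "responders"],
    "periodic": ["current_status", "tried", "next_steps"],
    "mitigation": ["action_taken", "is_working", "recurrence_risk"],
    "resolved": ["confirmation", "duration", "root_cause"],
    "post_incident": ["postmortem_link", "action_items"],
}


def missing_fields(stage: str, update: dict[str, str]) -> list[str]:
    """Return the fields expected at this stage that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS[stage]
        if not update.get(field, "").strip()
    ]


draft = {"what_is_broken": "API returns 500s", "severity": "SEV-1"}
print(missing_fields("opened", draft))
# prints: ['responders']
```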
External communication
Status page updates
Your status page is the primary external communication channel. Update it at every stage:
Investigating:
We are investigating reports of elevated API response times. Some requests may be slower than usual.
Identified:
We have identified the cause of elevated API response times and are implementing a fix. API functionality is not affected.
Monitoring:
A fix has been applied and we are monitoring the results. Response times are returning to normal.
Resolved:
The issue causing elevated API response times has been resolved. All systems are operating normally.
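
The four stages above form a small state machine. The sketch below models them in Python under two assumptions that do not come from the text: stages are normally visited in order, and an incident in Monitoring may drop back to Investigating if the fix does not hold. The names `Stage`, `ALLOWED_NEXT`, and `advance` are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MONITORING = "Monitoring"
    RESOLVED = "Resolved"


# Which stage may follow which. The strict ordering and the fallback from
# Monitoring to Investigating are assumptions made for this sketch.
ALLOWED_NEXT = {
    Stage.INVESTIGATING: {Stage.IDENTIFIED},
    Stage.IDENTIFIED: {Stage.MONITORING},
    Stage.MONITORING: {Stage.RESOLVED, Stage.INVESTIGATING},
    Stage.RESOLVED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_NEXT[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


stage = advance(Stage.INVESTIGATING, Stage.IDENTIFIED)
print(stage.value)
# prints: Identified
```

A guard like this mainly prevents a status page from jumping straight to Resolved without a Monitoring period. Teams that allow skipping stages would loosen `ALLOWED_NEXT` accordingly.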
Principles for external communication
- Acknowledge quickly — even before you know the cause, acknowledge the problem
- Be honest — don’t minimize or hide impact. Users trust transparency.
- Use plain language — “Some API requests are failing” not “We’re experiencing degraded throughput on our ingestion pipeline”
- Give timelines — “We expect this to be resolved within 2 hours” (and update if it changes)
- Don’t speculate — share what you know, not what you think might be happening
What NOT to communicate
- Internal implementation details
- Blame or individual names
- Speculative root causes
- Security vulnerability details (handle through security disclosure process)
Communication templates
Prepare templates before incidents happen:
Initial acknowledgement
We are aware of [brief description of issue]. Our team is investigating. We will provide updates every [interval]. Current status: [Investigating/Identified].
Progress update
Update on [issue]: We have identified [brief cause] and are [action being taken]. Expected resolution: [time estimate or “no ETA yet”]. Next update in [interval].
Resolution
[Issue] has been resolved. [Brief explanation of what happened and fix]. Duration: [X hours/minutes]. We apologize for the inconvenience and will publish a full post-incident report.
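
Templates like these can be stored in code so that a missing field fails loudly instead of publishing a literal placeholder. This is a minimal sketch using Python's standard `string.Template`. The `$name` placeholders stand in for the bracketed fields above, and the variable names are assumptions.

```python
from string import Template

# The initial acknowledgement template, with $placeholders in place of the
# bracketed fields.
INITIAL_ACK = Template(
    "We are aware of $issue. Our team is investigating. "
    "We will provide updates every $interval. Current status: $status."
)

# substitute() raises KeyError if any placeholder is left unfilled.
message = INITIAL_ACK.substitute(
    issue="elevated API response times",
    interval="30 minutes",
    status="Investigating",
)
print(message)
```

`substitute` is used rather than `safe_substitute` on purpose: the latter would silently leave `$interval` in the published text if the field were forgotten.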
Choosing communication channels
| Audience | Channel | Cadence |
|---|---|---|
| Incident responders | Dedicated Slack channel | Real-time |
| Engineering team | Broader Slack channel | Every 30 min |
| Company leadership | Email or Slack summary | Hourly for SEV-1 |
| Customers | Status page | Every 30 min |
| Support team | Internal FAQ doc | At each stage change |
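
The cadence column can be expressed as data so that a bot or a checklist can flag which audiences are overdue. The sketch below is illustrative only. `CADENCE_MINUTES` and `audiences_due` are hypothetical names, and audiences without a timed cadence (real-time or stage-driven) are marked `None`.

```python
from typing import Optional

# Minutes between updates per audience, taken from the table above.
# None means the audience is not on a timer (real-time or per stage change).
# The 60-minute leadership cadence applies to SEV-1 incidents.
CADENCE_MINUTES: dict[str, Optional[int]] = {
    "Incident responders": None,
    "Engineering team": 30,
    "Company leadership": 60,
    "Customers": 30,
    "Support team": None,
}


def audiences_due(minutes_since_last: dict[str, int]) -> list[str]:
    """List timed audiences whose last update is at least one interval old."""
    return [
        audience
        for audience, interval in CADENCE_MINUTES.items()
        if interval is not None
        and minutes_since_last.get(audience, 0) >= interval
    ]


print(audiences_due({"Engineering team": 35, "Company leadership": 40, "Customers": 30}))
# prints: ['Engineering team', 'Customers']
```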