Communicating During Incidents

Clear communication during incidents keeps stakeholders informed and reduces panic. Establish templates, channels, and update cadences in advance.

Internal communication

Dedicated incident channel

Create a dedicated channel (Slack, Teams) for each significant incident:

Name it clearly: #incident-2025-03-15-api-outage
Pin the current status and timeline
Keep discussion focused — off-topic conversation goes elsewhere
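
The naming convention above can be generated rather than typed by hand, so every incident channel comes out consistent. The following is a minimal sketch: the function name `incident_channel_name` and the 80-character cap are assumptions for illustration, so check the limit of your own chat tool.

```python
import re
from datetime import date


def incident_channel_name(opened_on: date, summary: str, max_len: int = 80) -> str:
    """Build a channel name such as #incident-2025-03-15-api-outage."""
    # Lowercase the summary and collapse every run of characters that is
    # not a-z or 0-9 into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    name = f"incident-{opened_on.isoformat()}-{slug}"
    # Trim to the assumed length limit without leaving a trailing hyphen.
    return "#" + name[:max_len].rstrip("-")


print(incident_channel_name(date(2025, 3, 15), "API outage"))
# prints "#incident-2025-03-15-api-outage"
```

Putting the date first keeps channels sorted chronologically in the channel list.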

Regular status updates

Post updates at fixed intervals, even if nothing has changed:

14:30 update: Still investigating. Database team is looking at connection pool exhaustion. No ETA yet. Next update in 15 minutes.

Posting nothing is worse than posting “still working on it.” Silence breeds anxiety.
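
One way to keep the cadence honest is to format every update the same way and to check whether the next one is overdue. This is a sketch only: `format_update`, `is_update_overdue`, and the 15-minute default are hypothetical names and values chosen to match the example update above.

```python
from datetime import datetime, timedelta
from typing import Optional


def format_update(now: datetime, status: str, detail: str,
                  eta: Optional[str] = None, interval_min: int = 15) -> str:
    """Render a status line in the same shape as the example update."""
    eta_text = f"ETA: {eta}." if eta else "No ETA yet."
    return (f"{now:%H:%M} update: {status}. {detail} {eta_text} "
            f"Next update in {interval_min} minutes.")


def is_update_overdue(last_posted: datetime, now: datetime,
                      interval_min: int = 15) -> bool:
    """Return True once more than interval_min minutes have passed."""
    return now - last_posted > timedelta(minutes=interval_min)


print(format_update(
    datetime(2025, 3, 15, 14, 30),
    "Still investigating",
    "Database team is looking at connection pool exhaustion.",
))
```

A reminder bot or a timer in the incident channel could call `is_update_overdue` and nudge the incident lead when it returns True.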

What to communicate internally

Timing	Content
Incident opened	What’s broken, severity, who’s responding
Every 15–30 min	Current status, what’s been tried, what’s next
Mitigation applied	What was done, is it working, risk of recurrence
Resolved	Confirmation, duration, brief root cause
Post-incident	Postmortem link, action items

External communication

Status page updates

Your status page is the primary external communication channel. Update it at every stage:

Investigating:

We are investigating reports of elevated API response times. Some requests may be slower than usual.

Identified:

We have identified the cause of elevated API response times and are implementing a fix. API functionality is not affected.

Monitoring:

A fix has been applied and we are monitoring the results. Response times are returning to normal.

Resolved:

The issue causing elevated API response times has been resolved. All systems are operating normally.
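
The four stages above form a small state machine, and tooling that posts to the status page can refuse out-of-order updates. The sketch below is illustrative: the stage names come from this page, but the allowed transitions (including the step back from Monitoring to Investigating when a fix does not hold) are assumptions, not rules of any particular status page product.

```python
from enum import Enum


class Stage(Enum):
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MONITORING = "Monitoring"
    RESOLVED = "Resolved"


# Assumed transition rules: stages move forward, and Monitoring may fall
# back to Investigating if the fix turns out not to hold.
ALLOWED = {
    Stage.INVESTIGATING: {Stage.IDENTIFIED, Stage.MONITORING, Stage.RESOLVED},
    Stage.IDENTIFIED: {Stage.MONITORING, Stage.RESOLVED},
    Stage.MONITORING: {Stage.RESOLVED, Stage.INVESTIGATING},
    Stage.RESOLVED: set(),
}


def advance(current: Stage, new: Stage) -> Stage:
    """Return the new stage, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


stage = advance(Stage.INVESTIGATING, Stage.IDENTIFIED)
print(stage.value)  # prints "Identified"
```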

Principles for external communication

Acknowledge quickly — even before you know the cause, acknowledge the problem
Be honest — don’t minimize or hide impact. Users trust transparency.
Use plain language — “Some API requests are failing” not “We’re experiencing degraded throughput on our ingestion pipeline”
Give timelines — “We expect this to be resolved within 2 hours” (and update if it changes)
Don’t speculate — share what you know, not what you think might be happening

What NOT to communicate

Internal implementation details
Blame or individual names
Speculative root causes
Security vulnerability details (handle through security disclosure process)

Communication templates

Prepare templates before incidents happen:

Initial acknowledgement

We are aware of [brief description of issue]. Our team is investigating. We will provide updates every [interval]. Current status: [Investigating/Identified].

Progress update

Update on [issue]: We have identified [brief cause] and are [action being taken]. Expected resolution: [time estimate or “no ETA yet”]. Next update in [interval].

Resolution

[Issue] has been resolved. [Brief explanation of what happened and fix]. Duration: [X hours/minutes]. We apologize for the inconvenience and will publish a full post-incident report.
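
Templates like these can be stored as format strings so that a missing field fails loudly instead of being posted with a placeholder still in it. A minimal sketch, assuming the bracketed placeholders are renamed to hypothetical field names such as `issue`, `interval`, and `status`:

```python
INITIAL_ACK = (
    "We are aware of {issue}. Our team is investigating. "
    "We will provide updates every {interval}. Current status: {status}."
)

PROGRESS_UPDATE = (
    "Update on {issue}: We have identified {cause} and are {action}. "
    "Expected resolution: {eta}. Next update in {interval}."
)


def render(template: str, **fields: str) -> str:
    """Fill a template; str.format raises KeyError if a field is missing."""
    return template.format(**fields)


print(render(
    INITIAL_ACK,
    issue="elevated API response times",
    interval="30 minutes",
    status="Investigating",
))
```

Because `str.format` raises `KeyError` on an unfilled field, a half-completed message cannot reach the status page by accident.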

Choosing communication channels

Audience	Channel	Cadence
Incident responders	Dedicated Slack channel	Real-time
Engineering team	Broader Slack channel	Every 30 min
Company leadership	Email or Slack summary	Hourly for SEV-1
Customers	Status page	Every 30 min
Support team	Internal FAQ doc	At each stage change
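
The cadence column can also be encoded as data so that a script or bot can report which audiences are due for an update. This is a sketch under stated assumptions: `CADENCE_MIN` and `audiences_due` are hypothetical names, the leadership value reflects the hourly SEV-1 cadence from the table, and event-driven rows (real-time, stage change) are marked `None` and skipped.

```python
from typing import Dict, List, Optional

# Minutes between scheduled updates, taken from the table above.
# None marks audiences updated by events rather than on a timer.
CADENCE_MIN: Dict[str, Optional[int]] = {
    "Incident responders": None,  # real-time
    "Engineering team": 30,
    "Company leadership": 60,     # hourly, assuming a SEV-1
    "Customers": 30,
    "Support team": None,         # at each stage change
}


def audiences_due(minutes_since_last: Dict[str, int]) -> List[str]:
    """List timer-based audiences whose update interval has elapsed."""
    due = []
    for audience, cadence in CADENCE_MIN.items():
        if cadence is None:
            continue
        # An audience with no recorded update is treated as due.
        if minutes_since_last.get(audience, cadence) >= cadence:
            due.append(audience)
    return due


print(audiences_due({"Engineering team": 31, "Company leadership": 20, "Customers": 10}))
# prints "['Engineering team']"
```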

