🚨 How you handle crises defines you more than how you handle calm

PM Crisis Management
(2026 Edition)

Five moves for the first 60 minutes, five for hours 1–4, five post-incident steps, and five communication rules for handling production incidents.

First 60 Minutes

1. Assess severity (P0: users blocked, P1: degraded, P2: minor)
2. Assemble the right team in one chat (eng lead, ops, CS, comms)
3. Assign a single incident commander (not you, unless necessary)
4. Communicate early: 'We're aware, investigating, will update in 30 min'
5. Start a log of decisions and timestamps in the incident channel
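
The severity levels in step 1 and the decision log in step 5 can be sketched in a few lines of Python. This is only an illustration: the names `assess_severity` and `IncidentLog` are hypothetical, the two boolean inputs are a simplifying assumption about how impact is observed, and a real team would normally keep the log in its incident channel or tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def assess_severity(users_blocked: bool, degraded: bool) -> str:
    """Map observed impact to the P0/P1/P2 levels from step 1."""
    if users_blocked:
        return "P0"  # users cannot complete their task at all
    if degraded:
        return "P1"  # the product works, but poorly
    return "P2"      # minor issue


@dataclass
class IncidentLog:
    """Append-only record of decisions with UTC timestamps (step 5)."""
    entries: List[str] = field(default_factory=list)

    def record(self, decision: str, when: Optional[datetime] = None) -> str:
        # Default to the current UTC time when no timestamp is supplied.
        when = when or datetime.now(timezone.utc)
        line = f"{when.strftime('%Y-%m-%d %H:%M')} UTC | {decision}"
        self.entries.append(line)
        return line


if __name__ == "__main__":
    log = IncidentLog()
    log.record(f"Declared {assess_severity(users_blocked=True, degraded=True)}")
    log.record("Incident commander assigned: eng lead")
    print("\n".join(log.entries))
```

Timestamping every entry in UTC keeps the later post-mortem timeline unambiguous when responders sit in different time zones.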

Hours 1–4

1. Roll back if the incident was caused by a deploy: fast recovery beats root-cause hunting
2. Keep the communication loop open with customer-facing teams, with an update every 30 min
3. Don't speculate publicly about the cause: 'investigating' is better than a wrong hypothesis
4. Protect engineers from stakeholder interruptions: they're fixing, you're communicating
5. Track user impact if possible (affected users, revenue blocked) as data for the post-mortem
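
As a rough illustration of the 30-minute cadence in step 2 and the impact tracking in step 5, the sketch below computes when the next update is due and summarises impact. The function names and the idea of estimating blocked revenue as an hourly rate multiplied by elapsed hours are assumptions made for the example, not a prescribed method.

```python
from datetime import datetime, timedelta, timezone

UPDATE_INTERVAL = timedelta(minutes=30)  # cadence from step 2


def next_update_due(last_update: datetime) -> datetime:
    """Return the time by which the next update should go out."""
    return last_update + UPDATE_INTERVAL


def impact_summary(affected_users: int, total_users: int,
                   revenue_blocked_per_hour: float, hours_elapsed: float) -> dict:
    """Summarise user and revenue impact for the post-mortem (step 5)."""
    # Guard against division by zero when the user base is unknown.
    share_pct = affected_users * 100 / total_users if total_users else 0.0
    return {
        "affected_users": affected_users,
        "affected_share_pct": round(share_pct, 1),
        # Assumed estimate: hourly blocked revenue times incident duration.
        "revenue_blocked": round(revenue_blocked_per_hour * hours_elapsed, 2),
    }


if __name__ == "__main__":
    sent = datetime(2026, 1, 5, 14, 0, tzinfo=timezone.utc)
    print("Next update due:", next_update_due(sent).isoformat())
    print(impact_summary(1200, 48000, 5000.0, 1.5))
```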

Post-Incident (24–72 hours)

1. Resolve the incident and confirm recovery with real user data
2. Tell customers what happened, what you did, and what you'll do to prevent a repeat
3. Run a blameless post-mortem within 48 hours
4. Assign prevention owners and deadlines that are specific, not generic
5. Share the post-mortem broadly: it signals that you learned from the incident rather than hid it

5 Communication Rules

1. Acknowledge first, diagnose second: users want to know you know
2. Be specific about impact and vague about root cause (until certain)
3. Set expectations ('next update in 30 min'), then hit that mark
4. Apologise if you caused it, but don't over-apologise: professionalism matters
5. Never blame individuals publicly: blame systems, fix systems
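
Rules 1 to 3 can be turned into a small message template. The sketch below is one possible shape, with a hypothetical `build_status_update` helper: it always acknowledges the issue and states the impact, mentions a cause only when one has been confirmed, and always commits to a next update time.

```python
from typing import Optional


def build_status_update(impact: str, next_update_min: int = 30,
                        confirmed_cause: Optional[str] = None) -> str:
    """Compose a status update: acknowledge, state impact, set expectations."""
    parts = [f"We're aware of an issue: {impact}."]  # rule 1: acknowledge first
    if confirmed_cause:
        parts.append(f"Cause: {confirmed_cause}.")   # rule 2: only once certain
    else:
        parts.append("We're investigating.")         # rule 2: no speculation
    parts.append(f"Next update in {next_update_min} min.")  # rule 3
    return " ".join(parts)


if __name__ == "__main__":
    print(build_status_update("checkout is failing for some users"))
```

Leaving `confirmed_cause` empty by default makes the non-speculative wording the path of least resistance.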

FAQ

What's the PM's role during a production incident?

Communication and coordination, not engineering. Your job is to assemble the team, ensure the right people are engaged, communicate with affected parties (customers, leadership, CS), and track decisions. Engineering owns the fix. PMs who try to engineer the fix themselves get in the way. The discipline is staying in your lane while providing air cover.

How do PMs rebuild trust after a major incident?

Three things: (1) a thorough, honest post-mortem shared publicly, (2) concrete prevention measures shipped within 30 days, (3) consistent behaviour going forward, with no repeat of the same mistake. Trust lost in one incident can take 6–12 months of consistent reliability to fully rebuild. PMs who handle incidents well often come out stronger than before: the incident becomes evidence of their judgment under pressure.
