L7.6.1 — AI incident response playbooks¶
Type: Theory · Duration: ~5 min · Status: Mandatory Module: Module 7 — Securing the AI Pipeline (MLSecOps & Defenses) Framework tags: NIST AI RMF Manage 4.1 · EU AI Act Article 73 (incident reporting)
Learning objectives¶
- Recognize three AI-specific IR scenarios and the playbook outline for each.
- Identify what AI-IR adds beyond classical IR (containment options, post-incident artifacts).
Core content¶
What AI-IR is¶
AI incident response is the discipline of responding to incidents involving AI systems — model outputs causing harm, model leaking data, prompt injection succeeding in production, supply-chain compromise, etc. Same shape as classical IR (detect, contain, eradicate, recover, post-incident review) with AI-specific twists.
The "AI-specific twists" are the reason AI-IR is its own concern: - Containment options include model rollback, guardrail tightening, model disabling. - Eradication often requires retraining or fine-tuning (not just patching code). - Post-incident artifacts include eval-suite updates, AI-BOM updates, model-card disclosure decisions.
Three AI-specific IR scenarios¶
Scenario A: Prompt injection in production. A user (or attacker) reports that a deployed LLM-powered feature produced harmful, unauthorized, or sensitive output. Initial triage: reproduce; identify the injection class (direct, indirect via what vector); classify severity (data exposed? action taken? scope of affected users?). Containment: tighten input filter / output filter / structured output schema; or disable the feature. Eradication: deploy hardened guardrail; update threat model; add eval case so regression is caught next time. Communication: customer-facing notification depending on severity (L0.3-style ethics + L2.5.3 EU AI Act considerations).
Scenario B: Sensitive information disclosure. Logs surface that the model emitted PII or proprietary information (training-data extraction per L5.3.1, or system-prompt leakage per L3.5.1). Initial triage: how much was disclosed, to whom, when. Containment: disable affected endpoint or scope down output. Eradication: tighten output PII redaction; retrain or fine-tune to reduce memorization if structural; update model card to disclose. Communication: regulatory notification (GDPR, HIPAA if applicable) within mandated timelines.
Scenario C: Supply chain compromise. A used artifact (base model, adapter, dataset, library) is disclosed as compromised. Initial triage: which features use this artifact; what's the impact path; is there a working exploit. Containment: pin to known-good version; disable features if necessary. Eradication: replace artifact; re-run AI-BOM update; revalidate downstream. Communication: notify affected internal teams; potentially notify customers if disclosed artifact was in shipped product.
IR playbook template¶
A reusable shape for AI-IR playbooks:
Section 1: Trigger conditions
- Specific telemetry signals that open this playbook
- Severity classification rubric
Section 2: Initial triage (first 30 min)
- Reproduction steps
- Severity confirmation
- Initial communication (to whom, what)
Section 3: Containment (first 2 hours)
- Disable / scope-down options (rank by cost vs effectiveness)
- Decision authority (who can pull the kill switch?)
Section 4: Eradication
- Root cause categories
- Required fixes per category
- Acceptance criteria for "fixed"
Section 5: Recovery
- Re-enable plan
- Monitoring during recovery
Section 6: Post-incident
- Mandatory artifacts: incident report, eval-suite update, AI-BOM update
- Customer / regulator communication checklist
- Time SLA for completion (e.g., 14 days)
The L7.9 lab walks a tabletop exercise against one of these playbooks.
What AI-IR adds beyond classical IR¶
Three things:
1. New containment options. Classical IR has limited containment short of taking systems offline. AI-IR has more graduated options: tighten guardrails, switch to fallback model, structured-output-only mode, etc. These need to be pre-built and tested.
2. Eradication that touches training. Most classical incidents are eradicated with code or config changes. Some AI incidents require retraining or fine-tuning — which means the IR timeline extends into training-cycle territory. Plan for it.
3. New post-incident artifacts. Beyond the standard incident report, AI-IR produces: eval-suite update (catch this class next time), AI-BOM update if a supply-chain component changed, possibly model-card disclosure, possibly regulatory notification under EU AI Act Article 73 for "serious incidents."
Pre-incident: practice¶
Standing tabletop exercises against AI-IR scenarios is high-ROI. The L7.9 lab includes a small tabletop format you can adapt to your organization. Most teams in 2026 don't run AI-specific tabletops; running one quarterly is a meaningful operational improvement.
Real-world example¶
The EU AI Act Article 73 requires serious-incident reporting to authorities within specified timelines for high-risk systems. Multiple early 2025 disclosures by major AI vendors triggered these reporting flows — and several published their post-incident reports. The published reports are useful templates for what good looks like.
Key terms¶
- AI incident response — IR for incidents involving AI systems.
- AI-specific containment — guardrail tightening, model switching, scoped-output mode.
- Eradication-via-retraining — when fixing the incident requires re-training or fine-tuning.
- EU AI Act Article 73 — serious-incident reporting for high-risk systems.
References¶
- NIST AI RMF Manage 4.1.
- EU AI Act Article 73.
- Anthropic / OpenAI public post-incident reports.
Quiz items¶
- Q: Name the three AI-specific IR scenarios discussed. A: Prompt injection in production; sensitive information disclosure; supply chain compromise.
- Q: What does AI-IR add beyond classical IR? A: Three things: new containment options (guardrail tightening, fallback model, structured-output mode); eradication that touches training (retraining/fine-tuning); new post-incident artifacts (eval-suite update, AI-BOM update, possibly model-card disclosure and regulatory notification).
- Q: Under EU AI Act Article 73, what's required and for which systems? A: Serious-incident reporting to authorities within specified timelines, for high-risk systems.
Video script (~600 words, ~4.5 min)¶
[SLIDE 1 — Title]
AI incident response playbooks. Five minutes.
[SLIDE 2 — What AI-IR is]
AI incident response is the discipline of responding to incidents involving AI systems — model outputs causing harm, model leaking data, prompt injection succeeding in production, supply-chain compromise. Same shape as classical IR — detect, contain, eradicate, recover, post-incident review — with AI-specific twists. The twists are the reason AI-IR is its own concern. Containment options include model rollback, guardrail tightening, model disabling. Eradication often requires retraining or fine-tuning, not just patching code. Post-incident artifacts include eval-suite updates, AI-BOM updates, model-card disclosure decisions.
[SLIDE 3 — Scenario A: Prompt injection in production]
Scenario A: prompt injection in production. A user or attacker reports that a deployed LLM-powered feature produced harmful, unauthorized, or sensitive output. Initial triage: reproduce, identify injection class — direct, indirect via what vector — classify severity, data exposed, action taken, scope of affected users. Containment: tighten input filter, output filter, structured output schema; or disable the feature. Eradication: deploy hardened guardrail, update threat model, add eval case so regression is caught next time. Communication: customer-facing notification depending on severity.
[SLIDE 4 — Scenario B: Sensitive information disclosure]
Scenario B: sensitive information disclosure. Logs surface that the model emitted PII or proprietary information. Training-data extraction or system-prompt leakage. Initial triage: how much, to whom, when. Containment: disable affected endpoint or scope down output. Eradication: tighten output PII redaction; retrain or fine-tune to reduce memorization if structural; update model card to disclose. Communication: regulatory notification — GDPR, HIPAA if applicable — within mandated timelines.
[SLIDE 5 — Scenario C: Supply chain compromise]
Scenario C: supply chain compromise. A used artifact — base model, adapter, dataset, library — is disclosed as compromised. Initial triage: which features use this artifact, impact path, working exploit. Containment: pin to known-good version, disable features if necessary. Eradication: replace artifact, re-run AI-BOM update, revalidate downstream. Communication: notify affected internal teams; potentially notify customers if disclosed artifact was in shipped product.
[SLIDE 6 — IR playbook template]
A reusable playbook shape. Section 1: trigger conditions — specific telemetry signals, severity classification rubric. Section 2: initial triage, first 30 min — reproduction steps, severity confirmation, initial communication. Section 3: containment, first 2 hours — disable/scope-down options ranked by cost vs effectiveness, decision authority. Section 4: eradication — root cause categories, required fixes per category, acceptance criteria for "fixed." Section 5: recovery — re-enable plan, monitoring during recovery. Section 6: post-incident — mandatory artifacts, customer/regulator communication checklist, time SLA.
The L7.9 lab walks a tabletop exercise against one of these playbooks.
[SLIDE 7 — What AI-IR adds]
Three things AI-IR adds beyond classical IR. New containment options — guardrail tightening, fallback model, scoped-output mode. Pre-built and tested. Eradication that touches training — some incidents require retraining or fine-tuning; IR timeline extends into training-cycle territory. New post-incident artifacts — eval-suite update, AI-BOM update, possibly model-card disclosure, possibly regulatory notification under EU AI Act Article 73 for serious incidents.
[SLIDE 8 — Practice + up next]
Standing tabletop exercises against AI-IR scenarios is high-ROI. The L7.9 lab includes a small tabletop format. Most teams in twenty-twenty-six don't run AI-specific tabletops. Running one quarterly is a meaningful operational improvement.
All theory done. Four labs next. See you there.
Slide outline¶
- Title — "AI incident response playbooks".
- What AI-IR is — classical IR loop + AI-specific overlays.
- Scenario A — prompt-injection IR walk-through.
- Scenario B — info-disclosure IR walk-through.
- Scenario C — supply-chain IR walk-through.
- Playbook template — six-section pseudocode block.
- What AI-IR adds — three-card layout.
- Practice + up next — tabletop callout + lab pointer.
Production notes¶
- Recording: ~4.5 min. Cap 5.
- Slides 3-5 should follow the same template visually for fast pattern recognition.