L7.6.1 — AI incident response playbooks¶

Type: Theory · Duration: ~5 min · Status: Mandatory Module: Module 7 — Securing the AI Pipeline (MLSecOps & Defenses) Framework tags: NIST AI RMF Manage 4.1 · EU AI Act Article 73 (incident reporting)

Learning objectives¶

Recognize three AI-specific IR scenarios and the playbook outline for each.
Identify what AI-IR adds beyond classical IR (containment options, post-incident artifacts).

Core content¶

What AI-IR is¶

AI incident response is the discipline of responding to incidents involving AI systems — model outputs causing harm, model leaking data, prompt injection succeeding in production, supply-chain compromise, etc. Same shape as classical IR (detect, contain, eradicate, recover, post-incident review) with AI-specific twists.

The "AI-specific twists" are the reason AI-IR is its own concern: - Containment options include model rollback, guardrail tightening, model disabling. - Eradication often requires retraining or fine-tuning (not just patching code). - Post-incident artifacts include eval-suite updates, AI-BOM updates, model-card disclosure decisions.

Three AI-specific IR scenarios¶

Scenario A: Prompt injection in production. A user (or attacker) reports that a deployed LLM-powered feature produced harmful, unauthorized, or sensitive output. Initial triage: reproduce; identify the injection class (direct, indirect via what vector); classify severity (data exposed? action taken? scope of affected users?). Containment: tighten input filter / output filter / structured output schema; or disable the feature. Eradication: deploy hardened guardrail; update threat model; add eval case so regression is caught next time. Communication: customer-facing notification depending on severity (L0.3-style ethics + L2.5.3 EU AI Act considerations).

Scenario B: Sensitive information disclosure. Logs surface that the model emitted PII or proprietary information (training-data extraction per L5.3.1, or system-prompt leakage per L3.5.1). Initial triage: how much was disclosed, to whom, when. Containment: disable affected endpoint or scope down output. Eradication: tighten output PII redaction; retrain or fine-tune to reduce memorization if structural; update model card to disclose. Communication: regulatory notification (GDPR, HIPAA if applicable) within mandated timelines.

Scenario C: Supply chain compromise. A used artifact (base model, adapter, dataset, library) is disclosed as compromised. Initial triage: which features use this artifact; what's the impact path; is there a working exploit. Containment: pin to known-good version; disable features if necessary. Eradication: replace artifact; re-run AI-BOM update; revalidate downstream. Communication: notify affected internal teams; potentially notify customers if disclosed artifact was in shipped product.

IR playbook template¶

A reusable shape for AI-IR playbooks:

Section 1: Trigger conditions
  - Specific telemetry signals that open this playbook
  - Severity classification rubric

Section 2: Initial triage (first 30 min)
  - Reproduction steps
  - Severity confirmation
  - Initial communication (to whom, what)

Section 3: Containment (first 2 hours)
  - Disable / scope-down options (rank by cost vs effectiveness)
  - Decision authority (who can pull the kill switch?)

Section 4: Eradication
  - Root cause categories
  - Required fixes per category
  - Acceptance criteria for "fixed"

Section 5: Recovery
  - Re-enable plan
  - Monitoring during recovery

Section 6: Post-incident
  - Mandatory artifacts: incident report, eval-suite update, AI-BOM update
  - Customer / regulator communication checklist
  - Time SLA for completion (e.g., 14 days)

The L7.9 lab walks a tabletop exercise against one of these playbooks.

What AI-IR adds beyond classical IR¶

Three things:

1. New containment options. Classical IR has limited containment short of taking systems offline. AI-IR has more graduated options: tighten guardrails, switch to fallback model, structured-output-only mode, etc. These need to be pre-built and tested.

2. Eradication that touches training. Most classical incidents are eradicated with code or config changes. Some AI incidents require retraining or fine-tuning — which means the IR timeline extends into training-cycle territory. Plan for it.

3. New post-incident artifacts. Beyond the standard incident report, AI-IR produces: eval-suite update (catch this class next time), AI-BOM update if a supply-chain component changed, possibly model-card disclosure, possibly regulatory notification under EU AI Act Article 73 for "serious incidents."

Pre-incident: practice¶

Standing tabletop exercises against AI-IR scenarios is high-ROI. The L7.9 lab includes a small tabletop format you can adapt to your organization. Most teams in 2026 don't run AI-specific tabletops; running one quarterly is a meaningful operational improvement.

Real-world example¶

The EU AI Act Article 73 requires serious-incident reporting to authorities within specified timelines for high-risk systems. Multiple early 2025 disclosures by major AI vendors triggered these reporting flows — and several published their post-incident reports. The published reports are useful templates for what good looks like.

Key terms¶

AI incident response — IR for incidents involving AI systems.
AI-specific containment — guardrail tightening, model switching, scoped-output mode.
Eradication-via-retraining — when fixing the incident requires re-training or fine-tuning.
EU AI Act Article 73 — serious-incident reporting for high-risk systems.

References¶

NIST AI RMF Manage 4.1.
EU AI Act Article 73.
Anthropic / OpenAI public post-incident reports.

Quiz items¶

Q: Name the three AI-specific IR scenarios discussed. A: Prompt injection in production; sensitive information disclosure; supply chain compromise.
Q: What does AI-IR add beyond classical IR? A: Three things: new containment options (guardrail tightening, fallback model, structured-output mode); eradication that touches training (retraining/fine-tuning); new post-incident artifacts (eval-suite update, AI-BOM update, possibly model-card disclosure and regulatory notification).
Q: Under EU AI Act Article 73, what's required and for which systems? A: Serious-incident reporting to authorities within specified timelines, for high-risk systems.

Video script (~600 words, ~4.5 min)¶

[SLIDE 1 — Title]

AI incident response playbooks. Five minutes.

[SLIDE 2 — What AI-IR is]

AI incident response is the discipline of responding to incidents involving AI systems — model outputs causing harm, model leaking data, prompt injection succeeding in production, supply-chain compromise. Same shape as classical IR — detect, contain, eradicate, recover, post-incident review — with AI-specific twists. The twists are the reason AI-IR is its own concern. Containment options include model rollback, guardrail tightening, model disabling. Eradication often requires retraining or fine-tuning, not just patching code. Post-incident artifacts include eval-suite updates, AI-BOM updates, model-card disclosure decisions.

[SLIDE 3 — Scenario A: Prompt injection in production]

Scenario A: prompt injection in production. A user or attacker reports that a deployed LLM-powered feature produced harmful, unauthorized, or sensitive output. Initial triage: reproduce, identify injection class — direct, indirect via what vector — classify severity, data exposed, action taken, scope of affected users. Containment: tighten input filter, output filter, structured output schema; or disable the feature. Eradication: deploy hardened guardrail, update threat model, add eval case so regression is caught next time. Communication: customer-facing notification depending on severity.

[SLIDE 4 — Scenario B: Sensitive information disclosure]

Scenario B: sensitive information disclosure. Logs surface that the model emitted PII or proprietary information. Training-data extraction or system-prompt leakage. Initial triage: how much, to whom, when. Containment: disable affected endpoint or scope down output. Eradication: tighten output PII redaction; retrain or fine-tune to reduce memorization if structural; update model card to disclose. Communication: regulatory notification — GDPR, HIPAA if applicable — within mandated timelines.

[SLIDE 5 — Scenario C: Supply chain compromise]

Scenario C: supply chain compromise. A used artifact — base model, adapter, dataset, library — is disclosed as compromised. Initial triage: which features use this artifact, impact path, working exploit. Containment: pin to known-good version, disable features if necessary. Eradication: replace artifact, re-run AI-BOM update, revalidate downstream. Communication: notify affected internal teams; potentially notify customers if disclosed artifact was in shipped product.

[SLIDE 6 — IR playbook template]

A reusable playbook shape. Section 1: trigger conditions — specific telemetry signals, severity classification rubric. Section 2: initial triage, first 30 min — reproduction steps, severity confirmation, initial communication. Section 3: containment, first 2 hours — disable/scope-down options ranked by cost vs effectiveness, decision authority. Section 4: eradication — root cause categories, required fixes per category, acceptance criteria for "fixed." Section 5: recovery — re-enable plan, monitoring during recovery. Section 6: post-incident — mandatory artifacts, customer/regulator communication checklist, time SLA.

The L7.9 lab walks a tabletop exercise against one of these playbooks.

[SLIDE 7 — What AI-IR adds]

Three things AI-IR adds beyond classical IR. New containment options — guardrail tightening, fallback model, scoped-output mode. Pre-built and tested. Eradication that touches training — some incidents require retraining or fine-tuning; IR timeline extends into training-cycle territory. New post-incident artifacts — eval-suite update, AI-BOM update, possibly model-card disclosure, possibly regulatory notification under EU AI Act Article 73 for serious incidents.

[SLIDE 8 — Practice + up next]

Standing tabletop exercises against AI-IR scenarios is high-ROI. The L7.9 lab includes a small tabletop format. Most teams in twenty-twenty-six don't run AI-specific tabletops. Running one quarterly is a meaningful operational improvement.

All theory done. Four labs next. See you there.

Slide outline¶

Title — "AI incident response playbooks".
What AI-IR is — classical IR loop + AI-specific overlays.
Scenario A — prompt-injection IR walk-through.
Scenario B — info-disclosure IR walk-through.
Scenario C — supply-chain IR walk-through.
Playbook template — six-section pseudocode block.
What AI-IR adds — three-card layout.
Practice + up next — tabletop callout + lab pointer.

Production notes¶

Recording: ~4.5 min. Cap 5.
Slides 3-5 should follow the same template visually for fast pattern recognition.