Module 3 — Summary

Type: Theory · Duration: ~3 min · Status: Mandatory

Video script

[SLIDE 1 — Module 3 wrap]

Module 3 wrap. You can now execute direct and indirect prompt injection across multiple patterns, exploit insecure output handling and excessive agency, and extract system prompts using three different techniques. Most importantly, you've measured which defenses actually move the needle. The before-and-after data from L3.9 is the artifact you'll be able to speak about most precisely as an AI security engineer. "We added this defense and reduced the attack success rate from 85 percent to 5 percent across four exploits" is the language CISOs and procurement officers care about.
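Companion note (not part of the spoken script): the before-and-after sentence quoted above boils down to two success rates computed over the same set of exploits. The sketch below shows one way that calculation could look. The `Trial` record, the function names, and the exploit labels are all hypothetical, and the demo data is synthetic, chosen only so the numbers match the example sentence. It is not the L3.9 harness and does not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One attack attempt. All field names are illustrative assumptions."""
    exploit: str      # label of the exploit that was attempted
    defended: bool    # was the defense enabled for this attempt?
    succeeded: bool   # did the attack achieve its goal?


def success_rate(trials, *, defended):
    """Fraction of successful attempts with the defense on or off."""
    subset = [t for t in trials if t.defended == defended]
    if not subset:
        raise ValueError("no trials recorded for this condition")
    return sum(t.succeeded for t in subset) / len(subset)


def headline(trials):
    """One-line before/after summary across all exploits."""
    before = success_rate(trials, defended=False)
    after = success_rate(trials, defended=True)
    n_exploits = len({t.exploit for t in trials})
    return (f"attack success rate {before:.0%} -> {after:.0%} "
            f"across {n_exploits} exploits")


def synthetic_trials(exploits, attempts, wins_before, wins_after):
    """Build made-up trials: the first `wins_*` attempts per exploit succeed."""
    trials = []
    for name in exploits:
        for i in range(attempts):
            trials.append(Trial(name, False, i < wins_before))
            trials.append(Trial(name, True, i < wins_after))
    return trials


if __name__ == "__main__":
    # Synthetic data only: 20 attempts per exploit, 17 succeed undefended,
    # 1 succeeds with the defense enabled.
    demo = synthetic_trials(
        ["direct-pi", "indirect-pi", "output-handling", "excessive-agency"],
        attempts=20, wins_before=17, wins_after=1,
    )
    print(headline(demo))
```

Pooling every attempt into one rate keeps the headline simple. Reporting a rate per exploit alongside it would show whether a defense helps against all four exploits or only some of them.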

[SLIDE 2 — What changes in Module 4]

Module 3 was inference-time attacks — the model behaving badly under crafted input. Module 4 is training-time and supply-chain attacks — the model itself being made bad before deployment. Three mandatory labs: poisoning a small classifier, planting a backdoor trigger, scanning a Hugging Face model for malicious pickles. One optional lab on building an AI Bill of Materials. The threat model gets bigger; the defenses get more architectural.

See you in Module 4.
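Companion note (not part of the spoken script): as a preview of the pickle-scanning lab mentioned above, the sketch below walks a pickle's opcode stream with the standard-library `pickletools.genops` and reports the opcodes that can import or call objects during loading. It never unpickles the data. The opcode set, the function names, and the demo payload are assumptions made for illustration, not the lab's actual tooling. Two caveats apply. Legitimate checkpoints also contain these opcodes, so a practical scanner would compare the imported names against an allowlist. Many model files are archives that wrap a pickle, whereas this sketch takes raw pickle bytes only.

```python
import pickle
import pickletools

# Opcodes that import a callable or invoke one while a pickle is loaded.
# This set is an illustrative assumption, not an exhaustive policy.
SUSPICIOUS_OPCODES = frozenset(
    {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}
)


def scan_pickle(data):
    """Return (position, opcode name, argument) for each flagged opcode.

    The bytes are only disassembled, never unpickled, so nothing in the
    payload gets executed by the scan itself.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings


class _PrintOnLoad:
    """Harmless stand-in for a malicious object: unpickling would call print."""

    def __reduce__(self):
        return (print, ("this ran during unpickling",))


def demo_payloads():
    """Build one plain-data pickle and one that calls a function on load."""
    benign = pickle.dumps({"weights": [0.1, 0.2]})
    risky = pickle.dumps(_PrintOnLoad())  # dumps() does not run the payload
    return benign, risky


if __name__ == "__main__":
    benign, risky = demo_payloads()
    print("benign findings:", scan_pickle(benign))
    for pos, name, arg in scan_pickle(risky):
        print(f"risky: offset {pos}: {name} {arg!r}")
```

A plain dictionary of floats and lists produces no findings, while the stand-in object produces an import opcode followed by `REDUCE`. That pairing is the pattern a scanner would then check against its allowlist.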

Slide outline

  1. Module 3 wrap — five-checkmark recap: Direct PI · Indirect PI · Insecure output · Excessive agency · System-prompt extraction; the defense-measurement data from L3.9 called out as the takeaway artifact.
  2. What's next — Module 4 teaser; pivot from inference-time to training-time/supply-chain.

Production notes

  • Recording: 2–3 min raw.
  • Same "Module N → Module N+1" visual convention.