Module 1 — AI/ML Foundations for Security Engineers¶

Duration: ~3.5 hrs · Status: Mandatory (the security-side foundations are a Module 2 prerequisite for the ML-side cohort) Lessons: 8 (5 theory · 3 labs — 2 mandatory + 1 optional) · Quiz: yes (12 questions, 70% to pass) Framework coverage: introduces the AI pipeline that every later attack/defense maps to. No single OWASP/ATLAS tag — sets the stage for all of them.

Module outcomes¶

By the end of this module, the learner can: 1. Explain in plain English what an ML model is, the difference between training and inference, and where classical ML differs from LLMs. 2. Describe the architecture of a modern LLM: tokenization, embeddings, the transformer block, and decoding strategies — at a depth sufficient to reason about attacks. 3. Map the full AI pipeline (data → training → eval → deployment → monitoring → fine-tune) and name at least one attack class per stage. 4. Run a real LLM locally and via a frontier API, and identify the trust boundaries between them. 5. Build a small Retrieval-Augmented Generation (RAG) system from scratch — the same system you'll later attack in Module 3 and defend in Module 7.

Lesson list¶

L1.1 — Machine learning in 30 minutes (Theory, ~30 min, mandatory)
L1.2 — Neural networks and deep learning (Theory, ~30 min, mandatory)
L1.3 — LLMs explained: tokens, embeddings, transformers, decoding (Theory, ~35 min, mandatory)
L1.4 — The modern AI pipeline (Theory, ~25 min, mandatory)
L1.5 — Where attacks happen at each pipeline stage (Theory, ~20 min, mandatory)
L1.6 — (Lab) Run an LLM locally vs via API; inspect a model card (~35 min, mandatory)
L1.7 — (Lab) Build a tiny RAG system from scratch (~50 min, mandatory)
L1.8 — (Lab, optional) Fine-tune a small model with LoRA (~75 min, optional)
Quiz — 12 questions, mix of multiple choice and short scenario (~10 min, mandatory)
Summary — bridge to Module 2 (~3 min, mandatory)

Why this module exists¶

Security engineers can defend HTTP without knowing how a NIC works, but they can't defend an LLM without knowing what a token is. The reason is that almost every LLM attack — prompt injection, jailbreaks, extraction, evasion, supply-chain — exploits something specific about how the model was built, trained, or deployed, not a generic protocol bug. If you don't know the parts, you can't reason about which parts are exposed.

This module is the absolute minimum AI/ML literacy required to do the rest of the course honestly. We are not training ML researchers. We are training people who can look at a chatbot architecture diagram and say, "the retrieval step is your indirect prompt-injection vector, the fine-tune dataset is your backdoor vector, and the tool-call layer is your agent-escape vector" — and back each claim with a defense.

The two mandatory labs ship the artifact that drives the rest of the course: the Module-1 RAG app. You build it here, you attack it in Module 3, you defend it in Module 7, and you sign off on a launch checklist for it in Module 8.

What's next¶

Module 2 — AI Security Foundations for ML Engineers. With the AI substrate now under your belt, Module 2 layers the security framing: threat modeling for AI, the AI attack surface, MITRE ATLAS deep dive, OWASP LLM Top 10 walk-through, and just enough NIST AI RMF + EU AI Act to inform engineering decisions.