L2.6: Threat-model the Module 1 RAG app (Lab)
Type: Lab · Duration: ~60 min · Status: Mandatory
Module: Module 2 (AI Security Foundations)
Framework tags: STRIDE-MA · MITRE ATLAS · OWASP LLM Top 10 · NIST AI RMF Map 1.1, 5.1; Measure 2.7
Goal of the lab
Produce a complete threat model of the Asfela Handbook RAG application you built in L1.7. Deliverable: a single markdown document (`threat-model.md`) containing a data-flow diagram, a STRIDE-MA threat table, ATLAS technique mappings, OWASP LLM Top 10 coverage, and a top-3-risk prioritization. This artifact gates Module 3, where you will attack the system you threat-modeled here.
Why this matters
Threat modeling separates "I know some attacks exist" from "I know which attacks apply to this specific system." Every red-team plan, every defensive control, every compliance attestation downstream is more credible when grounded in a written threat model. This lab gives you a real artifact you can use as a writing sample.
Prerequisites
- Skills: comfort with markdown, basic diagram drawing (mermaid, text-art, or external tool).
- Lessons: L2.1.1 through L2.5.3.
- Environment: the RAG app from L1.7 must be runnable in your lab container (`uv run python -c "from ai_sec.rag import query; print(query('What is Asfela\\'s PTO policy?'))"` should return an answer).
What you'll build
- `runs/lab2_6/threat-model.md`: the deliverable.
- A data-flow diagram (DFD) of the RAG app with trust boundaries.
- A STRIDE-MA table — at minimum 12 rows (eight letters × at least two DFD elements, sampled).
- An ATLAS technique mapping for the top 3 risks.
- An OWASP LLM Top 10 coverage matrix.
- A top-3 risks list with severity, likelihood, and proposed mitigations.
Steps
Step 1: Re-orient on the system
Make sure the RAG app is still working:
cd /workspace/ai-sec-course
uv run python -c "from ai_sec.rag import query; print(query('What is Asfela\\'s PTO policy?'))"
Expected: an answer with citations to `02-pto-policy.md`. If it fails, re-run the L1.7 build step (`uv run python -c "from pathlib import Path; from ai_sec.rag import load_corpus, embed; chunks=list(load_corpus(Path('corpora/asfela-handbook'))); embed(chunks)"`).
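The check above can also be scripted so that a missing citation fails loudly. This is a minimal sketch, not part of `ai_sec`: the helper name `cites_pto_policy` is hypothetical, and it assumes only that the string form of whatever `query` returns includes the cited filename, which this lab does not confirm.

```python
from typing import Callable

EXPECTED_SOURCE = "02-pto-policy.md"  # citation named in Step 1


def cites_pto_policy(query_fn: Callable[[str], object]) -> bool:
    """Return True if the answer's text mentions the expected source file.

    query_fn is injected so the helper can be exercised without the RAG app;
    in the lab container you would pass ai_sec.rag.query.
    """
    answer = query_fn("What is Asfela's PTO policy?")
    # Assumption: str(answer) includes the citation text.
    return EXPECTED_SOURCE in str(answer)
```

In the lab container, something like `from ai_sec.rag import query; assert cites_pto_policy(query)` would exercise the real app.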
Step 2: Open the threat-model template
Copy the skeleton into your run directory and open it:
cp templates/threat-model-template.md runs/lab2_6/threat-model.md
$EDITOR runs/lab2_6/threat-model.md
The template has all section headers in place; your job is to fill in the content.
Step 3: Draw the data-flow diagram
Sketch the RAG application's data flows. At minimum, your diagram must include:
- External actors: Learner (the asking user), Handbook author (the corpus writer)
- Processes: RAG query handler, embedding service, retrieval, LLM (Ollama)
- Data stores: corpus (`corpora/asfela-handbook/`), Chroma vector DB, prompt template
- Data flows: user query → query handler → retrieval → LLM → response
- Trust boundaries: at least three — user input/system, app/LLM, corpus author/corpus
Use mermaid (recommended) or text-art. A mermaid skeleton you can extend:
```mermaid
flowchart LR
    User([Learner])
    Author([Handbook author])
    QH[Query Handler]
    Emb[Embedding Service]
    Ret[Retrieval]
    LLM[Ollama LLM]
    Corp[(Corpus .md files)]
    DB[(Chroma DB)]
    SP[/System Prompt/]

    User -->|"question"| QH
    Author -.->|"writes docs"| Corp
    Corp -->|"build-time"| Emb -->|"vectors"| DB
    QH -->|"query"| Emb -->|"query vector"| Ret
    DB -->|"stored chunks"| Ret
    Ret -->|"top-k chunks"| QH
    QH -->|"prompt"| LLM -->|"answer"| QH
    QH -->|"response"| User
    SP -.->|"static"| QH

    %% Trust boundaries (annotate visually in your diagram tool)
```
Mark trust boundaries explicitly. Tip: most threats surface where an arrow crosses a trust boundary, so examine those arrows first.
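To make the tip concrete, the DFD can be written down as data and the boundary-crossing arrows listed mechanically. The sketch below is illustrative only: the zone names are assumptions chosen to match the three trust boundaries listed in this step, and the flow list follows the skeleton diagram plus the Chroma read that retrieval performs.

```python
# Zone assignment for each DFD node. Zone names are illustrative assumptions.
ZONES = {
    "User": "user",
    "Author": "author",
    "QH": "app",
    "Emb": "app",
    "Ret": "app",
    "DB": "app",
    "SP": "app",
    "Corp": "app",
    "LLM": "llm",
}

# Flows as (source, target, label), based on the skeleton DFD.
FLOWS = [
    ("User", "QH", "question"),
    ("Author", "Corp", "writes docs"),
    ("Corp", "Emb", "build-time"),
    ("Emb", "DB", "vectors"),
    ("QH", "Emb", "query"),
    ("Emb", "Ret", "query vector"),
    ("DB", "Ret", "stored chunks"),
    ("Ret", "QH", "top-k chunks"),
    ("QH", "LLM", "prompt"),
    ("LLM", "QH", "answer"),
    ("QH", "User", "response"),
    ("SP", "QH", "static"),
]


def boundary_crossings(flows, zones):
    """Return the flows whose source and target sit in different zones."""
    return [(s, t, label) for s, t, label in flows if zones[s] != zones[t]]


for src, dst, label in boundary_crossings(FLOWS, ZONES):
    print(f"{src} -> {dst} ({label}): {ZONES[src]} | {ZONES[dst]}")
```

With these zones, five arrows cross a boundary. Flows that stay inside one zone, such as retrieved chunks returning to the query handler, can still carry threats, so treat the list as a starting point rather than a filter.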
Step 4: Apply STRIDE-MA
For each DFD element, walk STRIDE-MA. Capture at least one threat per element where the category plausibly applies. Use the template's table:
| ID | Element | Category | Threat description | Impact | Likelihood |
|---|---|---|---|---|---|
| T01 | User → Query Handler | M (Model manip.) | Direct prompt injection in user query overrides system prompt | High | High |
| T02 | Author → Corpus | T (Tampering) | Author can plant content with adversarial instructions | High | Medium |
| T03 | Embedding → Chroma DB | I (Info disclosure) | Corpus embeddings stored in Chroma can be exfiltrated and inverted to recover source text | Medium | Low |
| T04 | Retrieval → Query Handler | M | Indirect injection — retrieved chunks contain instructions | High | Medium |
| ... | ... | ... | ... | ... | ... |
Aim for at least 12 rows. Don't force categories that don't apply — leave them out, but explain in a note why.
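Before moving on, it can help to lint the table for the mechanical requirements (row count, unique IDs, valid category letters). The following is a rough sketch under stated assumptions: it expects the six-column layout shown above, threat IDs of the form `T01`, and a category cell that starts with one of the eight STRIDE-MA letters. The function names are hypothetical, and the check says nothing about whether a threat is plausible.

```python
VALID_CATEGORIES = set("STRIDEMA")  # the eight STRIDE-MA letters
MIN_ROWS = 12


def parse_threat_rows(markdown: str) -> list[dict]:
    """Extract rows whose first cell looks like a threat ID (T01, T02, ...)."""
    rows = []
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 6 and cells[0].startswith("T") and cells[0][1:].isdigit():
            rows.append(
                {"id": cells[0], "element": cells[1], "category": cells[2][:1]}
            )
    return rows


def lint_threat_table(markdown: str) -> list[str]:
    """Return a list of human-readable problems; empty means the checks pass."""
    rows = parse_threat_rows(markdown)
    problems = []
    if len(rows) < MIN_ROWS:
        problems.append(f"only {len(rows)} rows, need at least {MIN_ROWS}")
    ids = [r["id"] for r in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate threat IDs")
    for r in rows:
        if r["category"] not in VALID_CATEGORIES:
            problems.append(f"{r['id']}: unknown category {r['category']!r}")
    return problems
```

A call such as `lint_threat_table(Path("runs/lab2_6/threat-model.md").read_text())` would run it against your deliverable. Cells that contain a literal `|` would confuse this simple parser.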
Step 5: Map top risks to MITRE ATLAS
Pick the top 3 highest-impact threats from your STRIDE-MA table. For each, find the matching ATLAS technique(s):
| Threat | ATLAS Technique ID | Technique name |
|---|---|---|
| T01 (direct PI) | AML.T0051.000 | Direct Prompt Injection |
| T04 (indirect PI via retrieved doc) | AML.T0051.001 | Indirect Prompt Injection |
| T02 (corpus tampering by authorized author) | AML.T0070 | RAG Poisoning |
Look these up on https://atlas.mitre.org and copy the canonical names.
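A quick format check catches typos in technique IDs before you look them up. This sketch assumes the ID shape seen in the table above (`AML.T` plus four digits, optionally followed by a three-digit sub-technique); that pattern is inferred from those examples. It only validates the format and cannot tell you whether a technique exists or fits the threat.

```python
import re

# AML.T followed by four digits, with an optional three-digit sub-technique.
ATLAS_ID = re.compile(r"AML\.T\d{4}(?:\.\d{3})?")


def is_atlas_technique_id(value: str) -> bool:
    """Return True if value has the shape of an ATLAS technique ID."""
    return ATLAS_ID.fullmatch(value.strip()) is not None


# Threat ID -> technique ID, taken from the mapping table above.
mapping = {
    "T01": "AML.T0051.000",
    "T04": "AML.T0051.001",
    "T02": "AML.T0070",
}

malformed = {t: tid for t, tid in mapping.items() if not is_atlas_technique_id(tid)}
print(malformed or "all technique IDs are well-formed")
```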
Step 6: OWASP LLM Top 10 coverage matrix
Fill the coverage matrix in the template:
| OWASP entry | Present in this system? | Current control? | Gap? |
|---|---|---|---|
| LLM01 Prompt Injection | Yes | None — system prompt only | Yes, severe |
| LLM02 Insecure Output Handling | Yes | None — output rendered as-is | Yes |
| LLM03 Training Data Poisoning | No (we don't train) | n/a | n/a |
| LLM04 Model DoS | Yes | No rate limit | Yes |
| LLM05 Supply Chain | Yes | No SBOM | Yes |
| LLM06 Sensitive Info Disclosure | Yes (PII in corpus would leak; system prompt can be extracted) | None | Yes |
| LLM07 Insecure Plugin Design | No (no plugins or tools) | n/a | n/a |
| LLM08 Excessive Agency | No (no tools) | n/a | n/a |
| LLM09 Overreliance | Yes | "if uncertain, say so" instruction | Partial |
| LLM10 Model Theft | Low (small local model) | n/a | Low |
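The rubric asks whether all ten entries were assessed, which is easy to verify mechanically. The sketch below is one possible check, assuming the four-column layout above and that each row's first cell begins with its `LLMnn` identifier. The function names are hypothetical.

```python
EXPECTED_IDS = [f"LLM{n:02d}" for n in range(1, 11)]  # LLM01 .. LLM10


def coverage_report(markdown: str) -> dict[str, str]:
    """Map each OWASP ID found in the table to the text of its 'Gap?' cell."""
    gaps = {}
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 4 and cells[0][:5] in EXPECTED_IDS:
            gaps[cells[0][:5]] = cells[3]
    return gaps


def missing_entries(markdown: str) -> list[str]:
    """Return the OWASP IDs that have no row in the matrix."""
    found = coverage_report(markdown)
    return [entry for entry in EXPECTED_IDS if entry not in found]
```

An empty result from `missing_entries` means every entry has a row; whether each assessment is defensible is still a judgment call.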
Step 7: Pick your top 3 risks and propose mitigations
From everything above, name the three risks you would address first if you owned this product. For each, give:
- Why this is in the top 3 (impact, likelihood, exploitability)
- Proposed mitigation (specific control)
- Framework citations (ATLAS technique ID, OWASP entry, NIST AI RMF subcategory and, where one applies, an EU AI Act article)
Example:
Risk 1: Indirect prompt injection via corpus
The corpus is editable; an attacker (or insider) with write access can plant instructions that any user query can surface.
Mitigation: corpus content sanitization on ingest (strip instruction-shaped content), plus retrieval-time pattern detection.
Citations: ATLAS AML.T0051.001, AML.T0070 · OWASP LLM01 · NIST AI RMF Measure 2.7 · EU AI Act Art. 15
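For a first-pass ordering before you write the justification, the Impact and Likelihood columns from Step 4 can be turned into a rough score. The sketch below assumes a simple ordinal scale (Low=1, Medium=2, High=3) and a multiplicative score. Both are illustrative choices that the lab does not prescribe, and they ignore exploitability, so the output is a shortlist to argue about, not the answer.

```python
LEVELS = {"Low": 1, "Medium": 2, "High": 3}  # assumed ordinal scale


def risk_score(impact: str, likelihood: str) -> int:
    """Multiply ordinal impact by ordinal likelihood (range 1..9)."""
    return LEVELS[impact] * LEVELS[likelihood]


def top_risks(threats: list[dict], n: int = 3) -> list[dict]:
    """Return the n highest-scoring threats; ties keep their table order."""
    return sorted(
        threats,
        key=lambda t: risk_score(t["impact"], t["likelihood"]),
        reverse=True,
    )[:n]


# The four sample rows from the Step 4 table.
threats = [
    {"id": "T01", "impact": "High", "likelihood": "High"},
    {"id": "T02", "impact": "High", "likelihood": "Medium"},
    {"id": "T03", "impact": "Medium", "likelihood": "Low"},
    {"id": "T04", "impact": "High", "likelihood": "Medium"},
]

print([t["id"] for t in top_risks(threats)])  # ['T01', 'T02', 'T04']
```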
Step 8: Submit your threat model
Save `runs/lab2_6/threat-model.md`. The L2.6 grading rubric (in `runs/lab2_6/rubric.md`) scores on five dimensions:
- DFD completeness (does it show the system?)
- STRIDE-MA coverage (≥ 12 rows, plausible threats?)
- ATLAS mapping accuracy (correct technique IDs?)
- OWASP coverage matrix (all 10 entries assessed?)
- Top-3 risks reasoning (well-justified, multi-framework cited?)
Self-grade against the rubric. If you meet at least 4 of the 5 dimensions, you're done; otherwise, iterate on the weakest dimension and re-grade.
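The pass rule can be written down as a tiny check. This sketch reads the threshold as "at least four of the five dimensions met" and treats each dimension as pass/fail; the actual rubric in `runs/lab2_6/rubric.md` may use a finer scale, so treat the boolean scoring as an assumption.

```python
DIMENSIONS = (
    "DFD completeness",
    "STRIDE-MA coverage",
    "ATLAS mapping accuracy",
    "OWASP coverage matrix",
    "Top-3 risks reasoning",
)
PASS_THRESHOLD = 4  # number of dimensions that must be met


def lab_complete(results: dict[str, bool]) -> bool:
    """Return True when at least PASS_THRESHOLD dimensions are marked as met."""
    unscored = [d for d in DIMENSIONS if d not in results]
    if unscored:
        raise ValueError(f"unscored dimensions: {unscored}")
    return sum(bool(results[d]) for d in DIMENSIONS) >= PASS_THRESHOLD
```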
What just happened (debrief)
You produced a real threat model artifact. Three things to internalize.
Threat modeling is a writing exercise, not a checklist. The artifact is the deliverable. A senior reviewer reads it linearly: DFD → STRIDE table → ATLAS mapping → top-3 risks. The clarity of that read is more important than the count of threats. A 15-row STRIDE table you can defend beats a 50-row table that nobody reads.
Framework citations compound. Notice that the example risk in Step 7 carried four framework citations (ATLAS + OWASP + NIST + EU AI Act). This isn't decorative. Each citation lets a different stakeholder accept the finding: ATLAS for the red-teamer, OWASP for the engineer, NIST for the audit team, EU AI Act for compliance. One finding reaches four audiences without being rewritten.
Top-3 selection is the skill. Anyone can enumerate threats. The senior move is prioritizing: choosing which three (or five, or ten) to fix first, with defensible reasoning. The capstone in Module 9 scales this exercise up; the L2.6 deliverable is a smaller version of that same skill.
This threat model is now an input to Module 3. Every attack in Module 3 will trace back to a threat you identified here.
Extension challenges (optional)
- Easy. Re-run STRIDE-MA on an additional DFD element you didn't cover in Step 4. Add ≥ 2 rows.
- Medium. Add an "Agent layer" extension to your DFD: imagine the RAG app got a `send_email` tool. Re-run STRIDE-MA on the tool-call boundary. Identify ≥ 2 new threats in the Agency-abuse (A) category.
- Hard. Write a 1-page "AI Security Posture" summary for a fictional buyer's procurement review, translating your threat model into buyer-friendly language. Use the OWASP coverage matrix as your spine.
References
- Shostack, Threat Modeling, esp. ch. 5 (STRIDE) and ch. 11 (deliverables).
- MITRE ATLAS Matrix — https://atlas.mitre.org/matrices/
- OWASP LLM Top 10 — https://owasp.org/www-project-top-10-for-large-language-model-applications/
- NIST AI RMF Map and Measure functions — https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf
Provisioning spec (for lab platform admin, NOT shown to learner)
Container base image: aisec/labs-base:0.1
Additional pre-installed files:
- /workspace/ai-sec-course/templates/threat-model-template.md — markdown skeleton with all section headers in place
- /workspace/ai-sec-course/runs/lab2_6/rubric.md — five-dimension grading rubric (also used by graders)
- /workspace/ai-sec-course/runs/lab2_6/.gitkeep
Editor: any text editor (vim/nano/code-server) — the lab platform should expose at least one.
Network access: outbound to atlas.mitre.org, owasp.org, nvlpubs.nist.gov (for reference lookups during the lab).
Estimated resource use:
- RAM: minimal (mostly markdown plus the running RAG app, ~3 GB at peak)
- CPU: negligible
- Wallclock: 50–80 min
Grading: self-graded against the rubric in the lab. If we add a manual grading pass (paid certification track), provision a "submit for review" workflow that snapshots runs/lab2_6/threat-model.md to platform storage.
Notes for platform admin:
- The templates/threat-model-template.md should ship as part of the companion repo image. Add to next provisioning iteration if missing.
- This lab depends on L1.7's RAG state (`/workspace/.cache/chroma`). If the container session was reset between modules, that state is gone; Step 1 of the lab tells the learner how to rebuild the index before continuing.