L2.6 — Threat-model the Module 1 RAG app (Lab)

Type: Lab · Duration: ~60 min · Status: Mandatory
Module: Module 2 (AI Security Foundations)
Framework tags: STRIDE-MA · MITRE ATLAS · OWASP LLM Top 10 · NIST AI RMF Map 1.1, 5.1; Measure 2.7

Goal of the lab

Produce a complete threat model of the Asfela Handbook RAG application you built in L1.7. Deliverable: a single markdown document (threat-model.md) containing a data-flow diagram, a STRIDE-MA threat table, ATLAS technique mappings, OWASP LLM Top 10 coverage, and a top-3-risk prioritization. This artifact gates Module 3 — you'll attack the system you threat-modeled here.

Why this matters

Threat modeling separates "I know some attacks exist" from "I know which attacks apply to this specific system." Every red-team plan, every defensive control, every compliance attestation downstream is more credible when grounded in a written threat model. This lab gives you a real artifact you can use as a writing sample.

Prerequisites

  • Skills: comfort with markdown, basic diagram drawing (mermaid, text-art, or external tool).
  • Lessons: L2.1.1 through L2.5.3.
  • Environment: the RAG app from L1.7 must be runnable in your lab container (uv run python -c "from ai_sec.rag import query; print(query('What is Asfela\\'s PTO policy?'))" should return an answer).

What you'll build

  • runs/lab2_6/threat-model.md — the deliverable.
  • A data-flow diagram (DFD) of the RAG app with trust boundaries.
  • A STRIDE-MA table with at least 12 rows, sampled from the eight STRIDE-MA categories applied to at least two DFD elements.
  • An ATLAS technique mapping for the top 3 risks.
  • An OWASP LLM Top 10 coverage matrix.
  • A top-3 risks list with severity, likelihood, and proposed mitigations.

Steps

Step 1 — Re-orient on the system

Make sure the RAG app is still working:

cd /workspace/ai-sec-course
uv run python -c "from ai_sec.rag import query; print(query('What is Asfela\\'s PTO policy?'))"

Expected: an answer with citations to 02-pto-policy.md. If it fails, re-run the L1.7 build step (uv run python -c "from pathlib import Path; from ai_sec.rag import load_corpus, embed; chunks=list(load_corpus(Path('corpora/asfela-handbook'))); embed(chunks)").

Step 2 — Open the threat-model template

A skeleton awaits you:

cp templates/threat-model-template.md runs/lab2_6/threat-model.md
$EDITOR runs/lab2_6/threat-model.md

The template has all section headers in place; your job is to fill the content.

Step 3 — Draw the data-flow diagram

Sketch the RAG application's data flows. At minimum, your diagram must include:

  • External actors: Learner (the asking user), Handbook author (the corpus writer)
  • Processes: RAG query handler, embedding service, retrieval, LLM (Ollama)
  • Data stores: corpus (corpora/asfela-handbook/), Chroma vector DB, prompt template
  • Data flows: user query → query handler → retrieval → LLM → response
  • Trust boundaries: at least three — user input/system, app/LLM, corpus author/corpus

Use mermaid (recommended) or text-art. A mermaid skeleton you can extend:

```mermaid
flowchart LR
    User([Learner])
    Author([Handbook author])
    QH[Query Handler]
    Emb[Embedding Service]
    Ret[Retrieval]
    LLM[Ollama LLM]
    Corp[(Corpus .md files)]
    DB[(Chroma DB)]
    SP[/System Prompt/]

    User -->|"question"| QH
    Author -.->|"writes docs"| Corp
    Corp -->|"build-time"| Emb -->|"vectors"| DB
    QH -->|"query"| Emb -->|"query vector"| Ret
    DB -->|"stored chunks"| Ret
    Ret -->|"top-k chunks"| QH
    QH -->|"prompt"| LLM -->|"answer"| QH
    QH -->|"response"| User
    SP -.->|"static"| QH

    %% Trust boundaries (annotate visually in your diagram tool)
```

Mark trust boundaries explicitly. Tip: start with the arrows that cross a boundary, since that is where most threats enter the system.
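One way to find the boundary-crossing arrows is to write the DFD down as data and filter it. The sketch below mirrors the flows in the mermaid skeleton; the zone names (`user`, `author`, `app`, `model`) and the assignment of each element to a zone are assumptions made for illustration, chosen to match the three boundaries listed above. They are not part of the lab template.

```python
from dataclasses import dataclass

# Assumed zone assignment: one trust zone per DFD element (illustrative only).
ZONES = {
    "Learner": "user",
    "Handbook author": "author",
    "Corpus": "app",
    "Embedding Service": "app",
    "Chroma DB": "app",
    "Retrieval": "app",
    "Query Handler": "app",
    "System Prompt": "app",
    "Ollama LLM": "model",
}


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    label: str


# Flows copied from the mermaid skeleton.
FLOWS = [
    Flow("Learner", "Query Handler", "question"),
    Flow("Handbook author", "Corpus", "writes docs"),
    Flow("Corpus", "Embedding Service", "build-time"),
    Flow("Embedding Service", "Chroma DB", "vectors"),
    Flow("Query Handler", "Embedding Service", "query"),
    Flow("Embedding Service", "Retrieval", "query vector"),
    Flow("Retrieval", "Query Handler", "top-k chunks"),
    Flow("Query Handler", "Ollama LLM", "prompt"),
    Flow("Ollama LLM", "Query Handler", "answer"),
    Flow("Query Handler", "Learner", "response"),
    Flow("System Prompt", "Query Handler", "static"),
]


def boundary_crossings(flows, zones):
    """Return the flows whose source and target sit in different trust zones."""
    return [f for f in flows if zones[f.source] != zones[f.target]]


crossings = boundary_crossings(FLOWS, ZONES)
for f in crossings:
    print(f"{f.source} -> {f.target} ({f.label}): "
          f"{ZONES[f.source]} / {ZONES[f.target]}")
```

Under this zone assignment the "top-k chunks" flow stays inside the app zone even though it carries author-controlled text. A zone map alone does not capture that kind of propagation, so trace untrusted content past the arrow where it first enters.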

Step 4 — Apply STRIDE-MA

For each DFD element, walk STRIDE-MA. Capture at least one threat per element where the category plausibly applies. Use the template's table:

| ID | Element | Category | Threat description | Impact | Likelihood |
| --- | --- | --- | --- | --- | --- |
| T01 | User → Query Handler | M (Model manip.) | Direct prompt injection in user query overrides system prompt | High | High |
| T02 | Author → Corpus | T (Tampering) | Author can plant content with adversarial instructions | High | Medium |
| T03 | Corpus → Embedding | I (Info disclosure) | Embeddings of corpus stored in Chroma can be exfiltrated and reversed | Medium | Low |
| T04 | Retrieval → Query Handler | M | Indirect injection: retrieved chunks contain instructions | High | Medium |
| ... | ... | ... | ... | ... | ... |

Aim for at least 12 rows. Don't force categories that don't apply — leave them out, but explain in a note why.
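Before moving on, you may want a quick mechanical check of the table. The sketch below assumes the table is written as a markdown pipe table with the six columns shown above and IDs of the form `T01`, `T02`, and so on. The function names `parse_rows` and `check_table` are hypothetical helpers, not part of the `ai_sec` package. The check covers structure only: it cannot judge whether a threat is plausible.

```python
import re

COLUMNS = ("id", "element", "category", "threat", "impact", "likelihood")
VALID_LETTERS = set("STRIDEMA")  # the eight STRIDE-MA category letters

SAMPLE = """
| ID | Element | Category | Threat description | Impact | Likelihood |
| --- | --- | --- | --- | --- | --- |
| T01 | User → Query Handler | M (Model manip.) | Direct prompt injection | High | High |
| T02 | Author → Corpus | T (Tampering) | Planted adversarial content | High | Medium |
| T03 | Corpus → Embedding | I (Info disclosure) | Embedding exfiltration | Medium | Low |
| T04 | Retrieval → Query Handler | M | Indirect injection | High | Medium |
"""


def parse_rows(markdown: str) -> list[dict[str, str]]:
    """Extract threat rows from a pipe table; header and separator rows are skipped."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Keep only rows with six cells whose first cell looks like a threat ID.
        if len(cells) == len(COLUMNS) and re.fullmatch(r"T\d+", cells[0]):
            rows.append(dict(zip(COLUMNS, cells)))
    return rows


def check_table(rows: list[dict[str, str]], min_rows: int = 12) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows; the lab asks for at least {min_rows}")
    ids = [r["id"] for r in rows]
    for dup in sorted({i for i in ids if ids.count(i) > 1}):
        problems.append(f"duplicate ID {dup}")
    for r in rows:
        if r["category"][:1].upper() not in VALID_LETTERS:
            problems.append(f"{r['id']}: unknown category {r['category']!r}")
    if len({r["element"] for r in rows}) < 2:
        problems.append("fewer than two distinct DFD elements covered")
    return problems


for problem in check_table(parse_rows(SAMPLE)):
    print(problem)
```

A cell that itself contains a `|` character would break this simple parser, so keep threat descriptions free of pipes or escape them before checking.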

Step 5 — Map top risks to MITRE ATLAS

Pick the top 3 highest-impact threats from your STRIDE-MA table. For each, find the matching ATLAS technique(s):

| Threat | ATLAS Technique ID | Technique name |
| --- | --- | --- |
| T01 (direct PI) | AML.T0051.000 | Direct Prompt Injection |
| T04 (indirect PI via retrieved doc) | AML.T0051.001 | Indirect Prompt Injection |
| T02 (corpus tampering by authorized author) | AML.T0070 | RAG Poisoning |

Look these up on https://atlas.mitre.org and copy the canonical names.
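Mistyped technique IDs are an easy way to lose points on mapping accuracy. As a minimal sketch, the check below flags IDs that do not follow the `AML.T` plus four digits pattern, with an optional three-digit sub-technique suffix. It validates format only: an ID can be well-formed and still not exist, so the lookup on the ATLAS site remains necessary. The helper name `malformed_ids` is hypothetical.

```python
import re

# Technique: AML.T0070; sub-technique: AML.T0051.001
ATLAS_ID = re.compile(r"AML\.T\d{4}(?:\.\d{3})?")

# Mapping from the table above: threat ID -> ATLAS technique IDs.
MAPPING = {
    "T01": ["AML.T0051.000"],
    "T04": ["AML.T0051.001"],
    "T02": ["AML.T0070"],
}


def malformed_ids(mapping: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per threat, any technique IDs that do not match the ATLAS ID format."""
    bad = {}
    for threat, ids in mapping.items():
        wrong = [i for i in ids if not ATLAS_ID.fullmatch(i)]
        if wrong:
            bad[threat] = wrong
    return bad


print(malformed_ids(MAPPING))
```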

Step 6 — OWASP LLM Top 10 coverage matrix

Fill the coverage matrix in the template:

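| OWASP entry | Present in this system? | Current control? | Gap? |
| --- | --- | --- | --- |
| LLM01 Prompt Injection | Yes | None (system prompt only) | Yes, severe |
| LLM02 Insecure Output Handling | Yes | None (output rendered as-is) | Yes |
| LLM03 Training Data Poisoning | No (we don't train) | n/a | n/a |
| LLM04 Model DoS | Yes | No rate limit | Yes |
| LLM05 Supply Chain | Yes | No SBOM | Yes |
| LLM06 Sensitive Info Disclosure | Yes (PII in corpus would leak) | None | Yes |
| LLM07 System Prompt Leakage | Yes | None | Yes |
| LLM08 Excessive Agency | No (no tools) | n/a | n/a |
| LLM09 Overreliance | Yes | "if uncertain, say so" instruction | Partial |
| LLM10 Model Theft | Low (small local model) | n/a | Low |

Note: the entry names above follow the 2023 (v1.1) edition of the list, except LLM07. System Prompt Leakage is LLM07 in the 2025 edition, while v1.1 lists Insecure Plugin Design in that slot. Check which edition your template uses and keep the names consistent with it.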
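The rubric asks whether all 10 entries were assessed, which is easy to verify mechanically. The sketch below keys the matrix by entry ID only and stores the "Gap?" column as free text, using the values from the example matrix. The `review` helper and the dictionary layout are assumptions for illustration, not something the template provides.

```python
# The ten entry IDs, LLM01 through LLM10.
EXPECTED = [f"LLM{n:02d}" for n in range(1, 11)]

# "Gap?" column from the example matrix above, keyed by entry ID.
GAPS = {
    "LLM01": "Yes, severe",
    "LLM02": "Yes",
    "LLM03": "n/a",
    "LLM04": "Yes",
    "LLM05": "Yes",
    "LLM06": "Yes",
    "LLM07": "Yes",
    "LLM08": "n/a",
    "LLM09": "Partial",
    "LLM10": "Low",
}


def review(gaps: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (entries never assessed, entries whose gap is anything other than n/a)."""
    missing = [e for e in EXPECTED if e not in gaps]
    open_gaps = [e for e in EXPECTED if gaps.get(e, "n/a").lower() != "n/a"]
    return missing, open_gaps


missing, open_gaps = review(GAPS)
print("not assessed:", missing)
print("open gaps:", open_gaps)
```

The open-gap list is a useful cross-check for Step 7: each of your top-3 risks should trace back to at least one entry on it.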

Step 7 — Pick your top 3 risks and propose mitigations

From everything above, name the three risks you would address first if you owned this product, with:

  • Why this is in the top 3 (impact, likelihood, exploitability)
  • Proposed mitigation (specific control)
  • Framework citation (ATLAS technique ID, OWASP entry, NIST RMF subcategory)

Example:

Risk 1: Indirect prompt injection via corpus
The corpus is editable; an attacker (or insider) with write access can plant instructions any user query will surface.
Mitigation: corpus content sanitization on ingest (strip instruction-shaped content), plus retrieval-time pattern detection.
Citations: ATLAS AML.T0051.001, AML.T0070 · OWASP LLM01 · NIST AI RMF Measure 2.7 · EU AI Act Art. 15
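If you want a first-pass ordering before writing your justification, a simple ordinal score can help. The sketch below assumes a 1 to 3 scale for Low, Medium, and High and multiplies impact by likelihood. Both the scale and the multiplication are illustrative choices, not something the rubric prescribes, and the score ignores exploitability, which Step 7 also asks you to weigh.

```python
# Assumed ordinal scale (illustrative, not prescribed by the lab).
SCALE = {"Low": 1, "Medium": 2, "High": 3}

# (threat ID, impact, likelihood) taken from the example STRIDE-MA rows in Step 4.
THREATS = [
    ("T01", "High", "High"),
    ("T02", "High", "Medium"),
    ("T03", "Medium", "Low"),
    ("T04", "High", "Medium"),
]


def score(impact: str, likelihood: str) -> int:
    """Impact multiplied by likelihood on the assumed 1-3 scale."""
    return SCALE[impact] * SCALE[likelihood]


def top_risks(threats, n=3):
    """Return the n highest-scoring threat IDs; ties keep their table order."""
    ranked = sorted(threats, key=lambda t: -score(t[1], t[2]))
    return [t[0] for t in ranked[:n]]


print(top_risks(THREATS))  # ['T01', 'T02', 'T04']
```

Treat the ranking as a starting point. Two threats with the same score can differ sharply in how easy they are to exploit, and that difference belongs in your written reasoning.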

Step 8 — Submit your threat model

Save runs/lab2_6/threat-model.md. The L2.6 grading rubric (in runs/lab2_6/rubric.md) scores on five dimensions:

  1. DFD completeness (does it show the system?)
  2. STRIDE-MA coverage (≥ 12 rows, plausible threats?)
  3. ATLAS mapping accuracy (correct technique IDs?)
  4. OWASP coverage matrix (all 10 entries assessed?)
  5. Top-3 risks reasoning (well-justified, multi-framework cited?)

Self-grade against the rubric; if you pass at least 4 of the 5 dimensions, you're done. Otherwise, iterate.


What just happened (debrief)

You produced a real threat model artifact. Three things to internalize.

Threat modeling is a writing exercise, not a checklist. The artifact is the deliverable. A senior reviewer reads it linearly: DFD → STRIDE-MA table → ATLAS mapping → OWASP coverage matrix → top-3 risks. The clarity of that read is more important than the count of threats. A 15-row STRIDE-MA table you can defend beats a 50-row table that nobody reads.

Framework citations compound. Notice how the example risk in Step 7 carries four framework citations (ATLAS + OWASP + NIST + EU AI Act). This isn't decorative. Each citation lets a different stakeholder accept the finding: ATLAS for the red-teamer, OWASP for the engineer, NIST for the audit team, EU AI Act for compliance. One finding, four audiences, with no rewriting.

Top-3 selection is the skill. Anyone can enumerate threats. The senior move is prioritizing: choosing which three (or five, or ten) to fix first, with defensible reasoning. The capstone in Module 9 scales this exercise up; the L2.6 deliverable is a smaller version of that same skill.

This threat model is now an input to Module 3. Every attack in Module 3 will trace back to a threat you identified here.

Extension challenges (optional)

  • Easy. Re-run STRIDE-MA on an additional DFD element you didn't cover in Step 4. Add ≥ 2 rows.
  • Medium. Add an "Agent layer" extension to your DFD: imagine the RAG app got a send_email tool. Re-run STRIDE-MA on the tool-call boundary. Identify ≥ 2 new threats in the Agency-abuse (A) category.
  • Hard. Write a 1-page "AI Security Posture" summary for a fictional buyer's procurement review, translating your threat model into language a non-specialist buyer can follow. Use the OWASP coverage matrix as your spine.

References

  • Shostack, Threat Modeling, esp. ch. 5 (STRIDE) and ch. 11 (deliverables).
  • MITRE ATLAS Matrix — https://atlas.mitre.org/matrices/
  • OWASP LLM Top 10 — https://owasp.org/www-project-top-10-for-large-language-model-applications/
  • NIST AI RMF Map and Measure functions — https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf

Provisioning spec (for lab platform admin, NOT shown to learner)

Container base image: aisec/labs-base:0.1

Additional pre-installed files:

  • /workspace/ai-sec-course/templates/threat-model-template.md: markdown skeleton with all section headers in place
  • /workspace/ai-sec-course/runs/lab2_6/rubric.md: five-dimension grading rubric (also used by graders)
  • /workspace/ai-sec-course/runs/lab2_6/.gitkeep

Editor: any text editor (vim/nano/code-server) — the lab platform should expose at least one.

Network access: outbound to atlas.mitre.org, owasp.org, nvlpubs.nist.gov (for reference lookups during the lab).

Estimated resource use:

  • RAM: minimal (mostly markdown + the running RAG, ~3 GB at peak)
  • CPU: negligible
  • Wallclock: 50–80 min

Grading: self-graded against the rubric in the lab. If we add a manual grading pass (paid certification track), provision a "submit for review" workflow that snapshots runs/lab2_6/threat-model.md to platform storage.

Notes for platform admin:

  • The templates/threat-model-template.md should ship as part of the companion repo image. Add to next provisioning iteration if missing.
  • This lab references L1.7's RAG state (/workspace/.cache/chroma). If the container session has been reset between modules, the lab includes a "rebuild index first" instruction.