L3.2.2 — Indirect-PI delivery vectors and why they dominate¶

Type: Theory · Duration: ~5 min · Status: Mandatory Module: Module 3 — Prompt Injection & LLM Application Attacks Framework tags: OWASP LLM01 · MITRE ATLAS AML.T0051.001, AML.T0070

Learning objectives¶

Recall six common indirect-PI delivery vectors and a real-world incident or proof-of-concept for each.
Identify which vector(s) apply to a given AI system's architecture.

Core content¶

Same architectural pattern (L3.2.1), six different ways the payload reaches the model. Each one corresponds to a class of LLM-using system and a class of writer.

The six vectors¶

1. RAG corpus poisoning. The payload lives in a document stored in your retrieval corpus (vector DB + source files). Triggered when the asking user's query retrieves that document. - Writer population: anyone who can write to the corpus (often a much larger set than you assume). - Real-world: Lab L3.7 reproduces this against the Asfela handbook RAG.

2. Email injection (assistant-style apps). The payload is in an email body / subject / sender field. Triggered when the user asks an LLM assistant to summarize, draft a reply to, or take an action based on email. - Writer population: literally anyone with the user's email address. - Real-world: EchoLeak (M365 Copilot, 2025).

3. Web-content injection (agentic browsing). The payload is on a web page. Triggered when an agent browses to that page as part of a task ("research X for me"). - Writer population: anyone who can publish web content the agent might reach. - Real-world: Multiple 2024 disclosures against ChatGPT's Browsing tool, Perplexity, and agentic shopping assistants.

4. Document upload injection. The payload is in a PDF/DOCX/image the user uploads. Triggered when the LLM processes the upload (summarize, extract, answer-from). - Writer population: anyone the user accepts uploads from. Often: themselves (uploading a malicious doc they were socially-engineered to download), customers, vendors, contracted reviewers. - Real-world: Numerous proofs-of-concept; common penetration-test finding 2024-2026.

5. Tool-output injection. The payload is in the output of a tool the agent calls — a third-party API response, an SQL result, a file's contents read by a read_file tool. - Writer population: anyone who controls the data behind that tool. Vendors, peer users, public APIs. - Real-world: Documented attacks against agents that read GitHub issues, Slack messages, Notion pages as part of their workflow.

6. Code-comment / repository injection (developer-assistant apps). The payload is in source code comments, README files, or commit messages. Triggered when an LLM coding assistant or code-summarizer reads the repository. - Writer population: anyone who can land code in the repo (open-source contributors, internal devs, dependency authors transitively). - Real-world: Several 2024-2025 disclosures against Copilot-style code assistants reading malicious comments in checked-in code.

How to apply this map¶

For any AI system you threat-model, walk the six vectors. For each, ask: - "Does this system feed content from this vector into the model's context?" - "If yes, who is the writer population?" - "What's the worst payload someone in that population could plant?"

The answer to "who is the writer population" is almost always larger than the application team thinks. The default mental model is "trusted users only" — the reality is usually "anyone with email access" or "anyone who can publish a webpage" or "anyone who can land a PR."

Why indirect dominates¶

Direct PI's defenders have an easier job — they can rate-limit, ban, and apply input filters at the user channel. Indirect PI's defenders face writer populations they don't fully know and content streams they don't fully control. Plus, indirect PI exploits trust: the asking user is innocent, so user-account-based defenses don't apply.

Most production AI incidents in 2025–2026 — by count — are indirect. Direct PI is more talked-about (it's easier to demo on stage); indirect PI is more exploited in the wild.

Real-world example¶

Walking the six vectors against M365 Copilot circa 2025: (1) RAG poisoning of tenant SharePoint, (2) email injection — EchoLeak — yes, (3) web-content via the connector ecosystem if enabled, (4) document upload via files the user opens for summarization, (5) tool-output via any of the Microsoft Graph endpoints Copilot can call, (6) code-comment if Copilot in IDE mode is in scope. Six potential vectors in a single product. Each one needs its own defense. None of them goes away by patching any other.

Key terms¶

Writer population — the set of people who can write to a given vector's data source.
Vector-by-vector threat modeling — applying the six-vector walk to each system.

References¶

Greshake et al., "Not what you've signed up for" — the foundational paper that catalogues most of these vectors.
Aim Security EchoLeak disclosure.
Johann Rehberger's blog (embracethered.com) — extensive walk-throughs of vectors 3, 4, 5.

Quiz items¶

Q: Name three indirect-PI delivery vectors and one real-world incident or proof-of-concept for each. A: Any three of the six with a credible incident/POC. RAG (Lab L3.7), email (EchoLeak), web-content (ChatGPT Browsing disclosures), document upload (common pentest finding), tool-output (agents reading Slack/Notion/GitHub), code-comment (Copilot reading malicious code comments).
Q: You're scoping the indirect-PI surface for a Copilot-style coding assistant. Which two vectors should you focus on first? A: Code-comment / repository injection (vector 6) and tool-output injection (vector 5, if it reads from issue trackers / docs).

Video script (~620 words, ~4.5 min)¶

[SLIDE 1 — Title]

Indirect prompt injection delivery vectors and why they dominate. Five minutes. Same architectural pattern from last lesson. Six different ways the payload reaches the model.

[SLIDE 2 — Vector 1: RAG corpus]

Vector one. RAG corpus poisoning. The payload lives in a document stored in your retrieval corpus — vector DB plus source files. Triggered when the asking user's query retrieves that document. Writer population: anyone who can write to the corpus. Often a much larger set than you assume. Real-world: Lab L3.7 reproduces this against the Asfela handbook RAG.

[SLIDE 3 — Vector 2: email]

Vector two. Email injection in assistant-style apps. The payload is in an email body, subject, or sender field. Triggered when the user asks an LLM assistant to summarize, draft a reply to, or take action based on email. Writer population: literally anyone with the user's email address. Real-world: EchoLeak.

[SLIDE 4 — Vector 3: web content]

Vector three. Web-content injection in agentic browsing. The payload is on a web page. Triggered when an agent browses to that page as part of a task — "research X for me." Writer population: anyone who can publish web content the agent might reach. Real-world: multiple 2024 disclosures against ChatGPT's Browsing tool, Perplexity, and agentic shopping assistants.

[SLIDE 5 — Vector 4: document upload]

Vector four. Document upload injection. The payload is in a PDF, DOCX, image the user uploads. Triggered when the LLM processes the upload — summarize, extract, answer-from. Writer population: anyone the user accepts uploads from. Often: themselves, uploading a malicious doc they were socially-engineered to download. Or customers, vendors, contracted reviewers. Common penetration-test finding 2024 through 2026.

[SLIDE 6 — Vector 5: tool-output]

Vector five. Tool-output injection. The payload is in the output of a tool the agent calls — a third-party API response, an SQL result, a file's contents read by a read-file tool. Writer population: anyone who controls the data behind that tool. Vendors, peer users, public APIs. Real-world: documented attacks against agents that read GitHub issues, Slack messages, Notion pages as part of their workflow.

[SLIDE 7 — Vector 6: code-comment]

Vector six. Code-comment injection in developer-assistant apps. The payload is in source code comments, README files, commit messages. Triggered when an LLM coding assistant or code-summarizer reads the repository. Writer population: anyone who can land code in the repo. Open-source contributors. Internal devs. Dependency authors transitively. Real-world: several 2024-2025 disclosures against Copilot-style code assistants.

[SLIDE 8 — How to apply]

How to apply this map. For any AI system you threat-model, walk the six vectors. For each, ask: does this system feed content from this vector into the model's context? If yes, who is the writer population? What's the worst payload someone in that population could plant? The answer to "who is the writer population" is almost always larger than the application team thinks. Default mental model: "trusted users only." Reality: "anyone with email access" or "anyone who can publish a webpage" or "anyone who can land a PR."

[SLIDE 9 — Why indirect dominates]

Why indirect dominates. Direct PI defenders can rate-limit, ban, apply input filters at the user channel. Indirect PI defenders face writer populations they don't fully know and content streams they don't fully control. Plus, indirect PI exploits trust — the asking user is innocent. Most production AI incidents 2025-2026, by count, are indirect. Direct is more talked-about. Indirect is more exploited.

[SLIDE 10 — Up next]

Insecure output handling next. Five minutes. See you there.

Slide outline¶

Title — "Indirect-PI delivery vectors and why they dominate".
Vector 1: RAG — small architecture sketch + 1-line incident.
Vector 2: email — same shape.
Vector 3: web content — same shape.
Vector 4: document upload — same shape.
Vector 5: tool-output — same shape.
Vector 6: code-comment — same shape.
How to apply — three-question checklist.
Why indirect dominates — comparison: direct vs indirect defender's job.
Up next — "L3.3.1 — Output handling, ~5 min."

Production notes¶

Recording: ~4.5 min. Cap 5.
Slides 2-7 should follow the same template visually for fast learner pattern-recognition.