
L4.5.1 — Dependency risk in AI stacks

Type: Theory · Duration: ~4 min · Status: Mandatory · Module: Module 4 — Data Poisoning, Backdoors & Supply Chain · Framework tags: OWASP LLM05 · MITRE ATLAS AML.T0010

Learning objectives

  1. Identify five high-leverage software dependency categories in a typical 2026 AI stack.
  2. Apply three classical-SCA practices (CVE scanning, pinning, source restriction) to the AI dependency tree.

Core content

The 2026 AI dependency tree

A typical LLM-using application in 2026 pulls in dozens of dependencies. Five categories are worth distinguishing because each carries distinct risk:

1. LLM orchestration frameworks. LangChain, LlamaIndex, Semantic Kernel, Haystack, autogen. These wrap prompt templating, tool calling, agent loops, and retrieval. This is the most-CVE'd category in 2025–2026 because these frameworks evaluate user-supplied chain definitions and template strings. CVE-2023-29374 and its successors against LangChain are the canonical reference.

2. Vector databases. Chroma, Pinecone (SDK), Weaviate, Qdrant, pgvector, Milvus. New infrastructure category, less mature than relational DBs. Auth gaps, isolation weaknesses, and SQL-injection-equivalents in metadata filters all surfaced in 2024–2025.

3. Embedding-model libraries. sentence-transformers, OpenAI/Anthropic/Cohere SDKs, custom HF integrations. The libraries themselves are usually safe; the models they pull at runtime are the supply-chain risk (L4.4.*).

4. Inference runtimes. transformers (HuggingFace), vllm, llama.cpp, ollama, triton. Generally safer than the orchestration layer but with their own CVE history. Pay particular attention to runtimes that execute model-provided code (some agent inference modes).

5. Adjacent ML tooling. numpy, torch, tensorflow, training libraries, evaluation harnesses (promptfoo, garak, pytest-evals). Classical CVE management applies; some categories (data loaders, custom CUDA kernels) have outsized risk.
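
As a rough illustration of the five categories, the sketch below buckets package names by category. The category keys and the PyPI distribution names in the mapping are assumptions chosen for this example (for instance `haystack-ai`, `pymilvus`, `llama-cpp-python`); check them against what your own lockfile actually contains.

```python
import re
from importlib.metadata import distributions

# Hypothetical mapping for illustration only. Distribution names are
# assumptions and should be verified against your own lockfile.
CATEGORIES = {
    "orchestration": {"langchain", "llama-index", "semantic-kernel", "haystack-ai", "autogen"},
    "vector_db": {"chromadb", "pinecone", "weaviate-client", "qdrant-client", "pgvector", "pymilvus"},
    "embedding": {"sentence-transformers", "openai", "anthropic", "cohere"},
    "runtime": {"transformers", "vllm", "llama-cpp-python", "ollama"},
    "adjacent": {"numpy", "torch", "tensorflow", "promptfoo", "garak"},
}


def normalize(name):
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def classify(names):
    """Group package names by category; unknown names go to 'uncategorized'."""
    result = {category: [] for category in CATEGORIES}
    result["uncategorized"] = []
    for raw in names:
        name = normalize(raw)
        for category, members in CATEGORIES.items():
            if name in members:
                result[category].append(name)
                break
        else:
            result["uncategorized"].append(name)
    return result


def installed_names():
    """List distribution names installed in the current environment."""
    return [d.metadata["Name"] for d in distributions() if d.metadata["Name"]]
```

Calling `classify(installed_names())` in a project's virtual environment gives a first-pass view of which categories the stack touches; anything left in `uncategorized` still needs ordinary review.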

Three classical-SCA practices applied to AI

Software Composition Analysis (SCA) practices that work on classical code apply with light adaptation:

1. CVE scanning. Run pip-audit, safety, or Snyk against your locked dependency set (the lockfile or a requirements file exported from it, not just the loose ranges in pyproject.toml). Most AI dependencies have CVEs; some have many. Regular scanning + tracked remediation. Same as any other Python project.

2. Version pinning. Pin exact versions in your lockfile (uv lock). Critical for AI dependencies because the framework landscape changes fast and silent behavior changes between minor versions are common.

3. Source restriction. Configure pip / uv to pull only from the official PyPI or an internal index you control; reject any other mirrors and extra indices. For models, the equivalent is a private registry mirror (you mirror approved models from HuggingFace; production never pulls directly from public HF).
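
The pinning and source-restriction practices can be checked mechanically. The following is a minimal sketch that assumes dependencies are available in requirements.txt syntax (for example, exported from the lockfile); the function name `audit_requirements` and the default allowed index URL are illustrative choices, and the parser covers only the common cases.

```python
import re

ALLOWED_INDEX = "https://pypi.org/simple"  # assumption: swap in your internal index

# Matches "name==1.2.3" (optionally with extras and an environment marker).
# Wildcards such as "==2.*" and ranges such as ">=0.1" do not match.
PIN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*,]+\s*(;.*)?$"
)


def audit_requirements(text, allowed_index=ALLOWED_INDEX):
    """Return (line_number, problem) tuples for unpinned or off-index entries."""
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Drop comments, surrounding whitespace, and line-continuation backslashes.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip().rstrip("\\").strip()
        if not line or line.startswith("--hash"):
            continue
        if line.startswith(("--extra-index-url", "--find-links", "-f ")):
            problems.append((lineno, "additional package source"))
        elif line.startswith(("--index-url", "-i ")):
            parts = re.split(r"[=\s]+", line, maxsplit=1)
            url = parts[1].rstrip("/") if len(parts) > 1 else ""
            if url != allowed_index.rstrip("/"):
                problems.append((lineno, "index is not the allowed one"))
        elif line.startswith("-"):
            continue  # other pip options are out of scope for this sketch
        elif not PIN.match(line):
            problems.append((lineno, "version is not pinned exactly"))
    return problems
```

A CI step could fail whenever the returned list is non-empty. This complements a real lockfile and hash checking; it does not replace them.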

Three AI-specific adaptations

Three places where classical SCA isn't enough:

1. Models and datasets need their own BOM. pip doesn't track them. Build the inventory separately (L4.5.2).

2. LangChain-style frameworks need scrutiny on chain definitions, not just version. A vulnerable version + an attacker-supplied chain definition = code execution. Audit chain definitions checked into the codebase.

3. The "agent framework + tools" layer is its own concern. Each tool the agent can call is effectively a privileged capability. Module 3 covered the security implications.
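
Because pip does not track model or dataset files, a separate inventory has to record them. The sketch below shows one simple way to do that with content hashes. The entry fields (`path`, `bytes`, `sha256`) and the file suffix list are assumptions for illustration, not the AI-BOM format that L4.5.2 describes.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model weights are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root, suffixes=(".safetensors", ".gguf", ".bin", ".parquet")):
    """Record path, size, and SHA-256 for each model or dataset file under root."""
    entries = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            entries.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_file(path),
            })
    return entries


def verify(root, inventory):
    """Return recorded paths that are missing or whose hash no longer matches."""
    drifted = []
    for entry in inventory:
        target = Path(root) / entry["path"]
        if not target.is_file() or sha256_file(target) != entry["sha256"]:
            drifted.append(entry["path"])
    return drifted
```

Storing the inventory as JSON next to the lockfile and running `verify` at deploy time is one way to notice a swapped or modified artifact.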

Operational hygiene

  • pip-audit (or equivalent) in CI. Block PRs that introduce new high-severity CVEs.
  • Quarterly stack audit. Walk the dependency tree, refresh pins, prune unused deps.
  • Track upstream security advisories. Subscribe to the security pages for LangChain, Chroma, and the other frameworks and vector DBs you depend on.
  • Have an incident-response playbook. When a CVE drops on a critical AI dep (it will), what's your patching SLA?
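
One way to implement the "block PRs that introduce new CVEs" gate is to compare a pip-audit JSON report against a reviewed baseline. The sketch below assumes the report has a top-level `dependencies` list whose entries carry `name`, `version`, and a `vulns` list with `id` fields, which matches pip-audit's JSON output as far as I know. That report does not appear to include severity, so this sketch gates on any finding outside the baseline; filtering by severity would need an additional data source. The function names and the baseline file are hypothetical.

```python
import json


def new_vulnerabilities(report, baseline_ids):
    """Return (package, version, vuln_id) tuples not present in the baseline."""
    found = []
    for dep in report.get("dependencies", []):
        # Skipped dependencies may lack "vulns"; treat them as having none.
        for vuln in dep.get("vulns", []):
            if vuln["id"] not in baseline_ids:
                found.append((dep["name"], dep.get("version"), vuln["id"]))
    return sorted(found)


def gate(report_path, baseline_path):
    """Return a process exit code: 0 if nothing new, 1 otherwise."""
    with open(report_path, encoding="utf-8") as handle:
        report = json.load(handle)
    with open(baseline_path, encoding="utf-8") as handle:
        baseline_ids = set(json.load(handle))  # a JSON list of accepted IDs
    findings = new_vulnerabilities(report, baseline_ids)
    for name, version, vuln_id in findings:
        print(f"NEW: {name} {version} {vuln_id}")
    return 1 if findings else 0
```

In CI, the report could be produced with `pip-audit -r requirements.txt -f json -o audit.json` and the job failed when `gate("audit.json", "baseline.json")` returns 1. Each baseline entry should carry an owner and a review date so accepted findings do not become permanent.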

Real-world example

LangChain has shipped multiple CVEs since 2023 enabling arbitrary code execution via crafted chain definitions and templates. The pattern repeats: a framework feature designed to be flexible (eval-style expression in templates, code-execution tools) becomes the attack vector. Each incident is patched; the underlying flexibility-vs-safety tension persists across the framework ecosystem.
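
To make the flexibility-vs-safety tension concrete, the sketch below shows a restricted arithmetic evaluator of the kind that can stand in for `eval` when a feature only needs to compute numbers. It is an illustrative pattern, not the patch LangChain shipped; the function name `safe_arithmetic` and the set of allowed operators are choices made for this example.

```python
import ast
import operator

# Only these operators are allowed. Exponentiation is left out on purpose,
# since huge powers are an easy way to exhaust CPU or memory.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_arithmetic(expression):
    """Evaluate +, -, *, / over numeric literals; reject everything else."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, and so on all end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))
```

With plain `eval`, a string such as `__import__('os').system('id')` would run a shell command. Here it is rejected, because function calls are not in the allowlist. The cost is exactly the flexibility the original feature offered, which is why the tension keeps resurfacing.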

Key terms

  • SCA (Software Composition Analysis) — practice of inventorying and risk-assessing dependencies.
  • Chain definition — LangChain-style declarative pipeline spec; potential code-execution surface.
  • Private registry mirror — internal mirror of approved upstream artifacts.

References

  • LangChain CVE history — https://www.cve.org/CVERecord/SearchResults?query=langchain
  • pip-audit — https://pypi.org/project/pip-audit/
  • HuggingFace security best practices — https://huggingface.co/docs/hub/security

Quiz items

  1. Q: Name three high-leverage software dependency categories in a 2026 AI stack. A: Any three of: LLM orchestration frameworks, vector databases, embedding-model libraries, inference runtimes, adjacent ML tooling.
  2. Q: State one classical-SCA practice that needs adaptation for AI stacks, and what the adaptation is. A: Sample answer — Version pinning still applies, but additionally models and datasets need their own BOM (pip doesn't track them); an AI-BOM (L4.5.2) covers this gap.

Video script (~480 words, ~3.5 min)

[SLIDE 1 — Title]

Dependency risk in AI stacks. Four minutes. By the end you'll know the five high-leverage dependency categories and three classical-SCA practices adapted for AI.

[SLIDE 2 — The 2026 AI dependency tree]

A typical LLM-using application in twenty-twenty-six pulls in dozens of dependencies. Five categories worth distinguishing because each carries distinct risk.

One: LLM orchestration frameworks. LangChain, LlamaIndex, Semantic Kernel, Haystack, autogen. Wrap prompt templating, tool calling, agent loops, retrieval. Most-CVE'd category in 2025-2026 because they evaluate user-supplied chain definitions and template strings. LangChain CVEs from 2023 onward are the canonical reference.

Two: vector databases. Chroma, Pinecone SDK, Weaviate, Qdrant, pgvector, Milvus. New infrastructure category, less mature than relational DBs. Auth gaps, isolation weaknesses, SQL-injection-equivalents in metadata filters surfaced in 2024-2025.

Three: embedding-model libraries. sentence-transformers, vendor SDKs. The libraries themselves are usually safe. The models they pull at runtime are the supply-chain risk — covered in L4.4.

Four: inference runtimes. Transformers, vllm, llama.cpp, ollama, triton. Generally safer than orchestration; some CVE history. Particular attention to runtimes that execute model-provided code.

Five: adjacent ML tooling. numpy, torch, tensorflow, training libraries, evaluation harnesses. Classical CVE management applies.

[SLIDE 3 — Three classical-SCA practices applied to AI]

Software Composition Analysis practices that work on classical code apply with light adaptation. One: CVE scanning. Run pip-audit, safety, or Snyk against your lockfile. Regular scanning plus tracked remediation. Same as any other Python project. Two: version pinning. Pin exact versions in your lockfile. Critical for AI deps because the framework landscape changes fast and silent behavior changes between minor versions are common. Three: source restriction. Configure pip or uv to pull only from official PyPI or an internal index you control; reject any other mirrors and extra indices. For models, the equivalent is a private registry mirror.

[SLIDE 4 — Three AI-specific adaptations]

Three places where classical SCA isn't enough. One: models and datasets need their own BOM — pip doesn't track them. Build the inventory separately. L4.5.2 covers this. Two: LangChain-style frameworks need scrutiny on chain definitions, not just version. Vulnerable version plus attacker-supplied chain definition equals code execution. Audit chain definitions checked into the codebase. Three: the agent framework plus tools layer is its own concern. Each tool the agent can call is effectively a privileged capability. Module 3 covered this.

[SLIDE 5 — Operational hygiene + up next]

Operational hygiene. pip-audit in CI — block PRs that introduce new high-severity CVEs. Quarterly stack audit. Track upstream security advisories. Have an incident-response playbook — when a CVE drops on a critical AI dep, what's your patching SLA?

Next lesson: AI-BOM and provenance tracking. Five minutes. Then three labs. See you there.

Slide outline

  1. Title — "Dependency risk in AI stacks".
  2. Five dependency categories — five-card layout: orchestration · vector DB · embedding · runtime · adjacent.
  3. Three classical SCA practices — three-bullet list with tool names.
  4. Three AI-specific adaptations — three-bullet list with Module references.
  5. Operational hygiene — four-step checklist; "Up next" pointer.

Production notes

  • Recording: ~3.5 min. Cap 5.