L7.2.1 — Model governance, signing & provenance

Type: Theory · Duration: ~5 min · Status: Mandatory
Module: Module 7 — Securing the AI Pipeline (MLSecOps & Defenses)
Framework tags: OWASP LLM05 · NIST AI RMF Govern 1.3, Map 4.1 · MITRE ATLAS AML.T0010

Learning objectives

  1. Distinguish public vs private model registries and identify when each makes sense.
  2. Recognize Sigstore-for-models and the gap it closes.

Core content

Why model governance is its own concern

Module 4 covered supply-chain attacks (pickle malware, typosquatted models, model-card lies). Defenses there were point-in-time (scan this file, verify this publisher). Model governance is the operational counterpart — the practice of treating models as artifacts under change control with provenance, signing, access control, and lifecycle management.

Without governance, an application team's model stack is a snapshot of "whatever the lead engineer pulled six months ago." With governance, the stack is a versioned, auditable inventory.

Public vs private model registries

Public registries (HuggingFace Hub, Ollama registry):

  • Pros: vast selection, fast adoption, community engagement.
  • Cons: anyone can publish anything; no enforced scanning gate at publish (automated scans, where a registry runs them, flag files rather than block uploads); typosquatting risk; weak signature coverage in 2026.
  • When to use: experimentation, low-stakes deployments, models you've independently verified.

Private registries (internal HuggingFace-compatible registry, vendor's "trusted models" tier, private S3 with manifest):

  • Pros: curated by your team, verified at adoption, hash-pinned, access-controlled.
  • Cons: requires curator effort; lags behind public releases; potential single point of failure.
  • When to use: production deployments at any meaningful stakes.

The 2026 production pattern: public registries for experimentation, private registries for production. Models adopted from public registries pass through a curator-review process before landing in the private registry: scan (modelscan + picklescan), independent eval, model-card verification, signature check (when available), hash pin.
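The final step of that review, the hash pin, can be sketched as follows. This is a minimal illustration, assuming the curator records one SHA-256 per file at adoption time; the manifest shape (`pins`) and the function names are hypothetical, not a registry standard.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large weight files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pins(model_dir: Path, pins: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every pinned file matches.

    `pins` maps a file name to the SHA-256 recorded at curator review
    (hypothetical manifest format, for illustration only).
    """
    problems = []
    for name, expected in pins.items():
        target = model_dir / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of_file(target) != expected.lower():
            problems.append(f"hash mismatch: {name}")
    # A file on disk that the curator never pinned is also worth flagging.
    for extra in sorted(p.name for p in model_dir.iterdir() if p.is_file()):
        if extra not in pins:
            problems.append(f"unpinned file: {extra}")
    return problems
```

A deployment step would refuse to load the model unless `verify_pins(...)` returns an empty list. The pin only proves the file is the one that was reviewed; it says nothing about whether the review itself was thorough.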

Model signing — what Sigstore-for-models adds

Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files.

A signed model weight file ships with:

  • A signature over the file contents.
  • The signer's identity (verified via OpenID Connect, typically Google / GitHub).
  • A transparency log entry that anyone can verify after the fact.

A defender verifying a signed model can answer: "this file is what claimant X published, and the claim is recorded in a public, append-only log where any later alteration would be detectable."
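Why a transparency log resists quiet rewriting can be shown with a toy example. Sigstore's log is built on a Merkle tree; the sketch below deliberately simplifies that to a linear hash chain, and the entry fields and addresses are made up for illustration. The point it demonstrates is tamper evidence: altering an old entry changes the log head that monitors have already recorded.

```python
import hashlib
import json


def entry_hash(prev_head: str, entry: dict) -> str:
    """Hash an entry together with the previous head, chaining the log."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_head.encode("ascii") + payload).hexdigest()


def log_head(entries: list[dict]) -> str:
    """Replay the log from an all-zero starting value and return the final head."""
    head = "0" * 64
    for entry in entries:
        head = entry_hash(head, entry)
    return head


# Two illustrative entries; digests and signer addresses are placeholders.
log = [
    {"artifact_sha256": "aa" * 32, "signer": "release@example.org"},
    {"artifact_sha256": "bb" * 32, "signer": "release@example.org"},
]
published_head = log_head(log)  # a monitor records this value out of band

# Rewriting the first entry changes every later head, so the monitor notices.
tampered = [dict(log[0], signer="attacker@example.net"), log[1]]
assert log_head(tampered) != published_head
assert log_head(log) == published_head
```

A real Merkle-tree log additionally lets a verifier check that one entry is included without replaying the whole log, which a plain chain cannot do efficiently.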

What signing closes:

  • Tampering en route (CDN compromise, registry compromise, MITM).
  • Some typosquatting (the typo'd model would have to be signed by the actual publisher; identity verification makes this harder).
  • Repudiation (the publisher can't credibly disclaim authorship of a signed file).

What signing doesn't close:

  • A legitimate publisher's compromised credentials.
  • A signed-but-malicious model (the publisher might be the attacker).
  • Most provenance gaps about how the model was trained.

Signing is necessary, not sufficient. It is a meaningful but partial defense, and signature coverage across the ecosystem is still incomplete in 2026.
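The boundary between what signing closes and what it leaves open shows up in the policy check that runs after cryptographic verification. The sketch below assumes verification has already succeeded and produced a set of claims; `VerifiedSignature`, `TrustedPublisher`, the model names, and the addresses are all hypothetical and do not mirror any real tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedSignature:
    """Claims left after the signature itself has been checked (hypothetical shape)."""
    artifact_sha256: str
    signer_identity: str  # identity bound to the signing certificate
    oidc_issuer: str      # provider that vouched for that identity


@dataclass(frozen=True)
class TrustedPublisher:
    signer_identity: str
    oidc_issuer: str


def accept(model_name: str, local_sha256: str, sig: VerifiedSignature,
           trusted: dict[str, TrustedPublisher]) -> tuple[bool, str]:
    """Decide whether a validly signed file is one this team should trust."""
    expected = trusted.get(model_name)
    if expected is None:
        # A lookalike name has no allowlist entry, however valid its signature.
        return False, "no trusted publisher on file for this model name"
    if sig.artifact_sha256 != local_sha256:
        # Catches tampering en route: the signature covers other bytes.
        return False, "signature covers a different file"
    if (sig.signer_identity, sig.oidc_issuer) != (
            expected.signer_identity, expected.oidc_issuer):
        return False, "signed, but not by the expected publisher"
    # Not caught here: stolen publisher credentials or a malicious publisher.
    return True, "ok"


TRUSTED = {
    "acme/sentiment-base": TrustedPublisher(
        "release@acme.example", "https://accounts.google.com"),
}
```

Note the last comment in `accept`: a file signed with a compromised but legitimate identity passes every check, which is why the scan and independent eval steps remain part of curator review.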

Provenance beyond signing

Signing tells you who published the file. Provenance tells you who produced it and how. Five fields a "provenance-rich" model has:

  1. Publisher identity — verified via signing.
  2. Training-data summary — what corpora, with what filtering.
  3. Training pipeline manifest — what code, what dependencies, what hyperparameters.
  4. Eval summary — what benchmarks the model passed.
  5. Lineage — what base model (if any), what fine-tunes have been applied.

In 2026, only a fraction of models on public registries ship with all five. Vendor model cards (Anthropic, OpenAI, Meta) include 3-4; most community fine-tunes ship with 0-2. Closing that gap is partly the job of the AI security engineer, who advocates internally for an AI-BOM (Module 4, L4.5.2).
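The five fields lend themselves to a simple coverage score that a curator can record per model. The sketch below is one possible encoding; the class name, field names, and free-text values are assumptions for illustration, not an AI-BOM schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProvenanceRecord:
    """One optional slot per provenance field; None or "" means not provided."""
    publisher_identity: Optional[str] = None
    training_data_summary: Optional[str] = None
    pipeline_manifest: Optional[str] = None
    eval_summary: Optional[str] = None
    lineage: Optional[str] = None


def coverage(record: ProvenanceRecord) -> tuple[int, list[str]]:
    """Return how many of the five fields are present and which are missing."""
    missing = [f.name for f in fields(record) if not getattr(record, f.name)]
    return len(fields(record)) - len(missing), missing


# Illustrative community fine-tune that documents only its base model.
community_finetune = ProvenanceRecord(lineage="fine-tune of an open base model")
score, missing = coverage(community_finetune)
```

Here `score` is 1 and `missing` lists the other four fields, which gives the quarterly review a concrete gap to raise with the publisher or to weigh against the model's intended use.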

Operational governance

Three operational practices:

  1. Quarterly model review. Walk the AI-BOM. For each model: still in use? Still the right choice? Has the publisher's reputation changed? Any new advisories? Replace as needed.
  2. Change-control on model swaps. Replacing a production model is a change-control event with documented justification, rollback plan, and observability for behavior shifts.
  3. Vendor security reviews. Periodic reviews of model vendors with the same rigor as software vendor reviews. Procurement question: "what's your model-supply-chain security?"
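The quarterly review in the first practice can be partly automated as a walk over the AI-BOM. The sketch below assumes a very small inventory shape (`BomEntry`) and approximates "quarterly" as 90 days; both are illustrative choices, and the questions about fit and publisher reputation still need a human.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class BomEntry:
    """Hypothetical AI-BOM row; real inventories carry more fields."""
    name: str
    sha256: str
    last_reviewed: date
    in_use: bool = True
    open_advisories: list[str] = field(default_factory=list)


REVIEW_INTERVAL = timedelta(days=90)  # "quarterly", approximated as 90 days


def review_findings(bom: list[BomEntry], today: date) -> dict[str, list[str]]:
    """Map each model needing attention to the reasons it was flagged."""
    findings: dict[str, list[str]] = {}
    for entry in bom:
        reasons = []
        if not entry.in_use:
            reasons.append("no longer in use: candidate for removal")
        if today - entry.last_reviewed > REVIEW_INTERVAL:
            reasons.append("review overdue")
        if entry.open_advisories:
            reasons.append("open advisories: " + ", ".join(entry.open_advisories))
        if reasons:
            findings[entry.name] = reasons
    return findings
```

Anything the walk flags for replacement then flows into the second practice: the swap is raised as a change-control event with its justification and rollback plan.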

Real-world example

HuggingFace launched per-model "verified publisher" badges in 2024 and expanded signing pilots in 2025. Adoption is partial: large publishers (Meta, Microsoft, OpenAI) have signed, while most community publishers have not. That gap is the operational reality you're working within in 2026.

Key terms

  • Model governance — operational practice of treating models as artifacts under change control.
  • Public vs private registry — community-published vs curator-vetted.
  • Sigstore-for-models — code-signing ecosystem extended to ML artifacts.
  • Provenance fields — publisher identity, training-data summary, pipeline manifest, eval summary, lineage.

References

  • Sigstore — https://www.sigstore.dev/
  • HuggingFace verified publisher / signing documentation.
  • L4.4.* and L4.5.2 (the supply-chain and BOM lessons this builds on).

Quiz items

  1. Q: When does a private registry make sense vs a public registry? A: Public for experimentation, low-stakes; private for production. The 2026 production pattern is "public for exploration, private for production after curator review."
  2. Q: What does Sigstore-for-models add and what does it not close? A: Adds: signature over the file, verified signer identity, transparency log. Doesn't close: legitimate publisher compromise, signed-but-malicious models, most training-time provenance gaps.
  3. Q: Name three operational practices for model governance. A: Quarterly model review against AI-BOM; change-control on model swaps; vendor security reviews.

Video script (~580 words, ~4 min)

[SLIDE 1 — Title]

Model governance, signing, and provenance. Five minutes.

[SLIDE 2 — Why model governance is its own concern]

Module 4 covered supply-chain attacks. Defenses there were point-in-time — scan this file, verify this publisher. Model governance is the operational counterpart: the practice of treating models as artifacts under change control with provenance, signing, access control, lifecycle management. Without governance, an application team's model stack is a snapshot of whatever the lead engineer pulled six months ago. With governance, the stack is a versioned auditable inventory.

[SLIDE 3 — Public vs private registries]

Public registries: HuggingFace Hub, Ollama. Pros: vast selection, fast adoption, community engagement. Cons: anyone can publish anything, no enforced scanning gate at publish, typosquatting risk, weak signature coverage in twenty-twenty-six. When to use: experimentation, low-stakes deployments, models you've independently verified.

Private registries: internal HuggingFace-compatible, vendor's "trusted models" tier, private S3 with manifest. Pros: curated by your team, verified at adoption, hash-pinned, access-controlled. Cons: requires curator effort, lags behind public releases. When to use: production deployments at any meaningful stakes.

The 2026 production pattern: public for experimentation, private for production. Models adopted from public registries pass through a curator-review process before landing in private — scan, independent eval, model-card verification, signature check, hash pin.

[SLIDE 4 — Sigstore-for-models]

Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files. A signed model ships with: a signature over the file contents. The signer's identity verified via OpenID Connect, typically Google or GitHub. A transparency log entry anyone can verify after the fact.

A defender verifying a signed model can answer: this file is what claimant X published, and the claim is recorded in a public, append-only log where any later alteration would be detectable.

[SLIDE 5 — What signing closes and doesn't]

What signing closes: tampering en route — CDN compromise, registry compromise, MITM. Some typosquatting — the typo'd model would have to be signed by the actual publisher. Identity verification makes this harder. Repudiation — the publisher can't credibly disclaim authorship.

What signing doesn't close: a legitimate publisher's compromised credentials. A signed-but-malicious model — the publisher might be the attacker. Most provenance gaps about how the model was trained. Signing is necessary, not sufficient.

[SLIDE 6 — Provenance beyond signing]

Signing tells you who published the file. Provenance tells you who produced it and how. Five fields a provenance-rich model has. Publisher identity — verified via signing. Training-data summary — what corpora, with what filtering. Training-pipeline manifest — what code, dependencies, hyperparameters. Eval summary — what benchmarks the model passed. Lineage — what base model, what fine-tunes applied.

In twenty-twenty-six, only a fraction of public-registry models ship with all five. Vendor model cards include 3-4. Most community fine-tunes ship with 0-2. Increasing this is partly the AI security engineer's role.

[SLIDE 7 — Operational governance]

Three operational practices. Quarterly model review — walk the AI-BOM, replace as needed. Change-control on model swaps — replacing a production model is a change-control event with documented justification, rollback plan, observability for behavior shifts. Vendor security reviews — periodic, with same rigor as software vendor reviews.

[SLIDE 8 — Up next]

Next: runtime defenses. Two lessons. Then observability, red-team, IR. Then labs. See you there.

Slide outline

  1. Title — "Model governance, signing & provenance".
  2. Why governance is its own concern — pull-quote.
  3. Public vs private registries — two-card comparison.
  4. Sigstore-for-models — signed-artifact diagram.
  5. What signing closes / doesn't — two-column list.
  6. Provenance beyond signing — five-field checklist.
  7. Operational governance — three-practice list.
  8. Up next: pointer to the runtime-defense lessons.

Production notes

  • Recording: ~4 min. Cap 5.