L7.2.1 — Model governance, signing & provenance¶
Type: Theory · Duration: ~5 min · Status: Mandatory
Module: Module 7, Securing the AI Pipeline (MLSecOps & Defenses)
Framework tags: OWASP LLM05 · NIST AI RMF Govern 1.3, Map 4.1 · MITRE ATLAS AML.T0010
Learning objectives¶
- Distinguish public vs private model registries and identify when each makes sense.
- Recognize Sigstore-for-models and the gap it closes.
Core content¶
Why model governance is its own concern¶
Module 4 covered supply-chain attacks (pickle malware, typosquatted models, model-card lies). Defenses there were point-in-time (scan this file, verify this publisher). Model governance is the operational counterpart — the practice of treating models as artifacts under change control with provenance, signing, access control, and lifecycle management.
Without governance, an application team's model stack is a snapshot of "whatever the lead engineer pulled six months ago." With governance, the stack is a versioned, auditable inventory.
Public vs private model registries¶
Public registries (Hugging Face Hub, Ollama registry):
- Pros: vast selection, fast adoption, community engagement.
- Cons: anyone can publish anything; publish-time scanning, where it exists, is advisory rather than a gate; typosquatting risk; weak signature coverage in 2026.
- When to use: experimentation, low-stakes deployments, models you've independently verified.
Private registries (internal Hugging Face-compatible registry, vendor's "trusted models" tier, private S3 with manifest):
- Pros: curated by your team, verified at adoption, hash-pinned, access-controlled.
- Cons: requires curator effort; lags behind public releases; potential single point of failure.
- When to use: production deployments at any meaningful stakes.
The 2026 production pattern: public registries for experimentation, private registries for production. Models adopted from public registries pass through a curator-review process before landing in the private registry: scan (modelscan + picklescan), independent eval, model-card verification, signature check (when available), hash pin.
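The hash-pin step at the end of that review can be sketched as follows. This is a minimal sketch, assuming a hypothetical manifest that maps each file name to the SHA-256 digest recorded at adoption; `sha256_file`, `verify_pins`, and `demo` are illustrative names, not part of modelscan, picklescan, or any registry API.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash in chunks so large weight files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pins(model_dir: Path, pins: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every pinned file matches."""
    problems = []
    for name, expected in pins.items():
        candidate = model_dir / name
        if not candidate.is_file():
            problems.append(f"missing: {name}")
        elif sha256_file(candidate) != expected:
            problems.append(f"hash mismatch: {name}")
    # Anything on disk that the manifest does not mention is also reported.
    for extra in sorted(p.name for p in model_dir.iterdir() if p.name not in pins):
        problems.append(f"unpinned: {extra}")
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Pin a stand-in weight file at adoption, then tamper with it and re-check."""
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp)
        weights = model_dir / "model.safetensors"
        weights.write_bytes(b"stand-in weights")
        pins = {"model.safetensors": sha256_file(weights)}  # recorded at adoption
        clean = verify_pins(model_dir, pins)
        weights.write_bytes(b"tampered weights")  # simulate a swap after adoption
        tampered = verify_pins(model_dir, pins)
    return clean, tampered


if __name__ == "__main__":
    print(demo())
```

Running the same check at deploy time, not only at adoption, is what turns the pin from a record into a control: a file that changed after curator review fails before it is loaded.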
Model signing — what Sigstore-for-models adds¶
Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files.
A signed model weight file ships with:
- A signature over the file contents.
- The signer's identity (verified via OpenID Connect, typically Google / GitHub).
- A transparency log entry that anyone can verify after the fact.
A defender verifying a signed model can answer: "this file is what claimant X published, and the claim is recorded in a public, append-only log where tampering is detectable."
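What a transparency log offers is tamper evidence: anyone who saved an earlier checkpoint can detect a rewritten entry. The sketch below illustrates the idea with a simple hash chain. It is a simplified stand-in, not how Sigstore's log (Rekor) is built; Rekor uses a Merkle tree, which gives the same property with much cheaper proofs. `AppendOnlyLog`, `entry_hash`, and the record fields are hypothetical names for illustration.

```python
import hashlib
import json


def entry_hash(prev_head: str, record: dict) -> str:
    """Each head commits to the previous head plus the new record."""
    payload = prev_head + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AppendOnlyLog:
    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.records: list[dict] = []
        self.heads: list[str] = []

    def append(self, record: dict) -> str:
        """Add a record and return the new head, which an auditor can save."""
        prev = self.heads[-1] if self.heads else self.GENESIS
        head = entry_hash(prev, record)
        self.records.append(record)
        self.heads.append(head)
        return head

    def audit(self, trusted_head: str, size: int) -> bool:
        """Recompute the first `size` entries and compare with a saved head."""
        head = self.GENESIS
        for record in self.records[:size]:
            head = entry_hash(head, record)
        return head == trusted_head


def demo_log() -> tuple[bool, bool]:
    log = AppendOnlyLog()
    log.append({"signer": "releases@example.test", "digest": "aa11"})
    checkpoint = log.append({"signer": "releases@example.test", "digest": "bb22"})
    log.append({"signer": "other@example.test", "digest": "cc33"})  # log keeps growing
    before = log.audit(checkpoint, size=2)
    log.records[0]["digest"] = "ff00"  # operator quietly rewrites history
    after = log.audit(checkpoint, size=2)
    return before, after


if __name__ == "__main__":
    print(demo_log())
```

The saved checkpoint stays valid while the log grows, and stops matching as soon as any earlier entry is altered.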
What signing closes:
- Tampering en route (CDN compromise, registry compromise, MITM).
- Some typosquatting (a typosquatted copy cannot carry the real publisher's signature, so a verifier that checks the expected signer identity rejects it).
- Repudiation (the publisher can't credibly disclaim authorship of a signed file).
What signing doesn't close:
- A legitimate publisher's compromised credentials.
- A signed-but-malicious model (the publisher might be the attacker).
- Most provenance gaps about how the model was trained.
Signing is necessary, not sufficient. It's a meaningful but partial defense, and signature coverage across the 2026 ecosystem is still growing.
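The boundary between what signing closes and what it leaves open is easiest to see as an admission policy. The sketch below assumes some verifier has already produced a result; `VerificationResult`, `TrustPolicy`, and `admit` are hypothetical types for illustration and are not part of any Sigstore library. The identities and issuer URL are example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationResult:
    """Hypothetical output of a signature verifier."""
    signature_valid: bool
    signer_identity: str
    oidc_issuer: str
    in_transparency_log: bool


@dataclass(frozen=True)
class TrustPolicy:
    """Who we expect to have signed this particular model."""
    expected_identity: str
    expected_issuer: str


def admit(result: VerificationResult, policy: TrustPolicy) -> tuple[bool, str]:
    if not result.signature_valid:
        return False, "signature does not match file contents"  # tampering en route
    if not result.in_transparency_log:
        return False, "no transparency log entry"
    if result.oidc_issuer != policy.expected_issuer:
        return False, "unexpected identity provider"
    if result.signer_identity != policy.expected_identity:
        return False, "signed by an identity not trusted for this model"  # typosquat
    # A model signed by the expected publisher passes even if it is malicious
    # or the publisher's credentials were stolen: signing cannot see that.
    return True, "ok"


policy = TrustPolicy("releases@example-lab.test", "https://accounts.google.com")
genuine = VerificationResult(True, "releases@example-lab.test", "https://accounts.google.com", True)
typosquat = VerificationResult(True, "attacker@example.test", "https://accounts.google.com", True)

if __name__ == "__main__":
    print(admit(genuine, policy))
    print(admit(typosquat, policy))
```

Note that the typosquatted copy carries a perfectly valid signature. It is rejected only because the policy pins the expected identity, which is why a "signature present" check on its own closes much less than it appears to.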
Provenance beyond signing¶
Signing tells you who published the file. Provenance tells you who produced it and how. A "provenance-rich" model carries five fields:
- Publisher identity — verified via signing.
- Training-data summary — what corpora, with what filtering.
- Training pipeline manifest — what code, what dependencies, what hyperparameters.
- Eval summary — what benchmarks the model passed.
- Lineage — what base model (if any), what fine-tunes have been applied.
In 2026, only a fraction of models on public registries ship with all five. Vendor model cards (Anthropic, OpenAI, Meta) include 3-4. Most community fine-tunes ship with 0-2. Raising that coverage is partly the job of the AI security engineer, who can advocate internally for an AI-BOM (Module 4, L4.5.2).
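The five fields can be tracked as a simple record so that coverage becomes a number rather than an impression. This is a minimal sketch; `ProvenanceRecord` and its field names are assumptions made for illustration and do not follow any particular AI-BOM schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProvenanceRecord:
    """One optional slot per provenance field; None or empty means not provided."""
    publisher_identity: Optional[str] = None
    training_data_summary: Optional[str] = None
    pipeline_manifest: Optional[str] = None
    eval_summary: Optional[str] = None
    lineage: Optional[str] = None

    def present(self) -> list[str]:
        return [f.name for f in fields(self) if getattr(self, f.name)]

    def missing(self) -> list[str]:
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a community fine-tune that documents only who published it
# and which base model it started from (values are made up).
community_finetune = ProvenanceRecord(
    publisher_identity="someone@example.test",
    lineage="fine-tune of a public base model",
)

if __name__ == "__main__":
    print(len(community_finetune.present()), "of 5 fields present")
    print("missing:", community_finetune.missing())
```

A curator could use the `missing()` list as the agenda for the questions to send a publisher before a model is admitted to the private registry.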
Operational governance¶
Three operational practices:
- Quarterly model review. Walk the AI-BOM. For each model: still in use? Still the right choice? Has the publisher's reputation changed? Any new advisories? Replace as needed.
- Change-control on model swaps. Replacing a production model is a change-control event with documented justification, rollback plan, and observability for behavior shifts.
- Vendor security reviews. Periodic reviews of model vendors with the same rigor as software vendor reviews. Procurement question: "what's your model-supply-chain security?"
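The quarterly review is mostly a walk over inventory data, so part of it can be automated. The sketch below assumes a hypothetical in-memory AI-BOM; `BomEntry`, `review_findings`, the 90-day threshold, and the advisory ID are all illustrative. Judgment calls such as "still the right choice?" stay with a human reviewer.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class BomEntry:
    name: str
    pinned_sha256: str
    last_reviewed: date
    in_use: bool = True
    open_advisories: list[str] = field(default_factory=list)


def review_findings(
    bom: list[BomEntry], today: date, max_age_days: int = 90
) -> dict[str, list[str]]:
    """Return only the entries that need attention, with the reasons why."""
    findings: dict[str, list[str]] = {}
    for entry in bom:
        notes = []
        if not entry.in_use:
            notes.append("not in use: candidate for removal")
        if today - entry.last_reviewed > timedelta(days=max_age_days):
            notes.append("review overdue")
        if entry.open_advisories:
            notes.append("open advisories: " + ", ".join(entry.open_advisories))
        if notes:
            findings[entry.name] = notes
    return findings


bom = [
    BomEntry("summarizer-v3", "aa11", date(2026, 5, 1)),
    BomEntry("classifier-v1", "bb22", date(2026, 1, 15), open_advisories=["ADV-EXAMPLE-1"]),
    BomEntry("legacy-embedder", "cc33", date(2026, 6, 1), in_use=False),
]

if __name__ == "__main__":
    for name, notes in review_findings(bom, today=date(2026, 6, 30)).items():
        print(name, notes)
```

Entries that come back clean still belong in the review minutes; the output only prioritizes where the reviewer should start.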
Real-world example¶
Hugging Face launched per-model "verified publisher" badges in 2024 and expanded signing pilots in 2025. Adoption is partial: large publishers (Meta, Microsoft, OpenAI) have signed; most community publishers have not. The gap is the operational reality you're working within in 2026.
Key terms¶
- Model governance — operational practice of treating models as artifacts under change control.
- Public vs private registry — community-published vs curator-vetted.
- Sigstore-for-models — code-signing ecosystem extended to ML artifacts.
- Provenance fields — publisher identity, training-data summary, pipeline manifest, eval summary, lineage.
References¶
- Sigstore — https://www.sigstore.dev/
- Hugging Face verified publisher / signing documentation.
- L4.4.* and L4.5.2 (the supply-chain and BOM lessons this builds on).
Quiz items¶
- Q: When does a private registry make sense vs a public registry? A: Public for experimentation, low-stakes; private for production. The 2026 production pattern is "public for exploration, private for production after curator review."
- Q: What does Sigstore-for-models add and what does it not close? A: Adds: signature over the file, verified signer identity, transparency log. Doesn't close: legitimate publisher compromise, signed-but-malicious models, most training-time provenance gaps.
- Q: Name three operational practices for model governance. A: Quarterly model review against AI-BOM; change-control on model swaps; vendor security reviews.
Video script (~580 words, ~4 min)¶
[SLIDE 1 — Title]
Model governance, signing, and provenance. Five minutes.
[SLIDE 2 — Why model governance is its own concern]
Module 4 covered supply-chain attacks. Defenses there were point-in-time — scan this file, verify this publisher. Model governance is the operational counterpart: the practice of treating models as artifacts under change control with provenance, signing, access control, lifecycle management. Without governance, an application team's model stack is a snapshot of whatever the lead engineer pulled six months ago. With governance, the stack is a versioned auditable inventory.
[SLIDE 3 — Public vs private registries]
Public registries: Hugging Face Hub, Ollama. Pros: vast selection, fast adoption, community engagement. Cons: anyone can publish anything, publish-time scanning is advisory rather than a gate, typosquatting risk, weak signature coverage in twenty-twenty-six. When to use: experimentation, low-stakes deployments, models you've independently verified.
Private registries: internal Hugging Face-compatible, vendor's "trusted models" tier, private S3 with manifest. Pros: curated by your team, verified at adoption, hash-pinned, access-controlled. Cons: requires curator effort, lags behind public releases. When to use: production deployments at any meaningful stakes.
The 2026 production pattern: public for experimentation, private for production. Models adopted from public registries pass through a curator-review process before landing in private — scan, independent eval, model-card verification, signature check, hash pin.
[SLIDE 4 — Sigstore-for-models]
Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files. A signed model ships with: a signature over the file contents. The signer's identity verified via OpenID Connect, typically Google or GitHub. A transparency log entry anyone can verify after the fact.
A defender verifying a signed model can answer: this file is what claimant X published, and the claim is recorded in a public, tamper-evident log.
[SLIDE 5 — What signing closes and doesn't]
What signing closes: tampering en route — CDN compromise, registry compromise, MITM. Some typosquatting — the typo'd model would have to be signed by the actual publisher. Identity verification makes this harder. Repudiation — the publisher can't credibly disclaim authorship.
What signing doesn't close: a legitimate publisher's compromised credentials. A signed-but-malicious model — the publisher might be the attacker. Most provenance gaps about how the model was trained. Signing is necessary, not sufficient.
[SLIDE 6 — Provenance beyond signing]
Signing tells you who published the file. Provenance tells you who produced it and how. Five fields a provenance-rich model has. Publisher identity — verified via signing. Training-data summary — what corpora, with what filtering. Training-pipeline manifest — what code, dependencies, hyperparameters. Eval summary — what benchmarks the model passed. Lineage — what base model, what fine-tunes applied.
In twenty-twenty-six, only a fraction of public-registry models ship with all five. Vendor model cards include 3-4. Most community fine-tunes ship with 0-2. Increasing this is partly the AI security engineer's role.
[SLIDE 7 — Operational governance]
Three operational practices. Quarterly model review — walk the AI-BOM, replace as needed. Change-control on model swaps — replacing a production model is a change-control event with documented justification, rollback plan, observability for behavior shifts. Vendor security reviews — periodic, with same rigor as software vendor reviews.
[SLIDE 8 — Up next]
Next: runtime defenses. Two lessons. Then observability, red-team, IR. Then labs. See you there.
Slide outline¶
- Title — "Model governance, signing & provenance".
- Why governance is its own concern — pull-quote.
- Public vs private registries — two-card comparison.
- Sigstore-for-models — signed-artifact diagram.
- What signing closes / doesn't — two-column list.
- Provenance beyond signing — five-field checklist.
- Operational governance — three-practice list.
Production notes¶
- Recording: ~4 min. Cap 5.