L7.2.1: Model governance, signing & provenance

Type: Theory · Duration: ~5 min · Status: Mandatory
Module: Module 7, Securing the AI Pipeline (MLSecOps & Defenses)
Framework tags: OWASP LLM05 · NIST AI RMF Govern 1.3, Map 4.1 · MITRE ATLAS AML.T0010

Learning objectives

Distinguish public vs private model registries and identify when each makes sense.
Recognize Sigstore-for-models and the gap it closes.

Core content

Why model governance is its own concern

Module 4 covered supply-chain attacks (pickle malware, typosquatted models, model-card lies). Defenses there were point-in-time (scan this file, verify this publisher). Model governance is the operational counterpart — the practice of treating models as artifacts under change control with provenance, signing, access control, and lifecycle management.

Without governance, an application team's model stack is a snapshot of "whatever the lead engineer pulled six months ago." With governance, the stack is a versioned, auditable inventory.

Public vs private model registries

Public registries (HuggingFace Hub, Ollama registry):
- Pros: vast selection, fast adoption, community engagement.
- Cons: anyone can publish anything; publish-time scanning, where it exists, is advisory rather than a gate on what gets published; typosquatting risk; weak signature coverage in 2026.
- When to use: experimentation, low-stakes deployments, models you've independently verified.

Private registries (internal HuggingFace-compatible registry, vendor's "trusted models" tier, private S3 with manifest):
- Pros: curated by your team, verified at adoption, hash-pinned, access-controlled.
- Cons: requires curator effort; lags behind public releases; potential single point of failure.
- When to use: production deployments at any meaningful stakes.

The 2026 production pattern: public registries for experimentation, private registries for production. Models adopted from public registries pass through a curator-review process before landing in the private registry: scan (modelscan + picklescan), independent eval, model-card verification, signature check (when available), hash pin.
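The hash-pin step at the end of that review can be sketched in a few lines. This is a minimal illustration rather than a full curator tool: the `pinned` mapping of file name to SHA-256 digest and both function names are hypothetical, and the scan, eval, and signature steps are assumed to have run separately.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large weight files are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, pinned: dict[str, str]) -> bool:
    """Return True only if the file name has a pin and the digest matches it."""
    expected = pinned.get(path.name)  # hypothetical manifest: file name -> hex digest
    return expected is not None and sha256_of_file(path) == expected
```

A file with no pin is rejected rather than accepted, so a model that skipped curator review cannot slip through by simply being absent from the manifest.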

Model signing: what Sigstore-for-models adds

Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files.

A signed model weight file ships with:
- A signature over the file contents.
- The signer's identity (verified via OpenID Connect, typically Google / GitHub).
- A transparency log entry that anyone can verify after the fact.

A defender verifying a signed model can answer: "this file is what claimant X published, and the claim is recorded in a public, append-only log where tampering would be detectable."
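The tamper-evidence property of a transparency log can be illustrated with a toy hash chain. This is a simplified sketch only: production transparency logs such as Sigstore's Rekor use Merkle trees rather than a linear chain, and the record strings and function names here are hypothetical.

```python
import hashlib


def entry_hash(prev_head: str, record: str) -> str:
    """Each head commits to the previous head plus the new record."""
    return hashlib.sha256((prev_head + "\n" + record).encode("utf-8")).hexdigest()


def log_head(records: list[str]) -> str:
    """Fold all records into one head hash, starting from a fixed genesis value."""
    head = "0" * 64
    for record in records:
        head = entry_hash(head, record)
    return head


# Hypothetical log entries: "<signer identity> signed <artifact digest>".
records = [
    "alice@example.com signed sha256:aaaa",
    "bob@example.com signed sha256:bbbb",
]
published_head = log_head(records)

# Rewriting an earlier entry changes the head, so anyone who saved the
# published head can detect the edit.
tampered = ["mallory@example.com signed sha256:aaaa", records[1]]
tamper_detected = log_head(tampered) != published_head
```

The log does not prevent an operator from rewriting history; it makes the rewrite visible to anyone who kept an earlier head.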

What signing closes:
- Tampering en route (CDN compromise, registry compromise, MITM).
- Some typosquatting (a typosquatted model cannot carry the real publisher's signature, so a verifier that checks the expected signer identity will reject it).
- Repudiation (the publisher can't credibly disclaim authorship of a signed file).
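The typosquatting protection depends on the verifier checking who signed, not merely that some valid signature exists. A minimal sketch of that policy check follows. The `VerifiedSignature` shape, the `EXPECTED_SIGNERS` table, and the model and identity names are all hypothetical, and the cryptographic verification is assumed to have been done already by a signing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedSignature:
    """Result a signature verifier is assumed to return (hypothetical shape)."""
    signer_identity: str  # e.g. an email or workflow URI from the signing certificate
    oidc_issuer: str      # the identity provider that vouched for the signer


# Hypothetical policy: the identity and issuer expected for each model ID.
EXPECTED_SIGNERS = {
    "example-org/model-7b": ("release@example-org.example", "https://accounts.google.com"),
}


def signer_is_expected(model_id: str, sig: VerifiedSignature) -> bool:
    """Accept only when the model has a policy entry and both fields match exactly."""
    expected = EXPECTED_SIGNERS.get(model_id)
    return expected == (sig.signer_identity, sig.oidc_issuer)
```

A look-alike model ID has no policy entry, and a look-alike signer does not match the expected identity, so both cases fail closed.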

What signing doesn't close:
- A legitimate publisher's compromised credentials.
- A signed-but-malicious model (the publisher might be the attacker).
- Most provenance gaps about how the model was trained.

Signing is necessary, not sufficient. It is a meaningful but partial defense, and in 2026 the ecosystem is still expanding signature coverage.

Provenance beyond signing

Signing tells you who published the file. Provenance tells you who produced it and how. Five fields a "provenance-rich" model has:

Publisher identity — verified via signing.
Training-data summary — what corpora, with what filtering.
Training pipeline manifest — what code, what dependencies, what hyperparameters.
Eval summary — what benchmarks the model passed.
Lineage — what base model (if any), what fine-tunes have been applied.

In 2026, only a fraction of models on public registries ship with all five. Vendor model cards (Anthropic, OpenAI, Meta) include 3-4. Most community fine-tunes ship with 0-2. Raising that coverage is partly the job of the AI security engineer, who advocates internally for an AI-BOM (Module 4, L4.5.2).
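One way to make the five-field checklist measurable is to record it as a small data structure and count what is populated. The sketch below is illustrative only; the class name, field names, and example values are hypothetical and not a standard AI-BOM schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProvenanceRecord:
    """The five provenance fields; None means the publisher did not supply it."""
    publisher_identity: Optional[str] = None
    training_data_summary: Optional[str] = None
    pipeline_manifest: Optional[str] = None
    eval_summary: Optional[str] = None
    lineage: Optional[str] = None

    def populated(self) -> int:
        """Count the fields that hold a non-empty value."""
        return sum(1 for f in fields(self) if getattr(self, f.name))

    def missing(self) -> list[str]:
        """Names of the fields that are empty or absent, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical community fine-tune that documents only its base model and evals.
record = ProvenanceRecord(eval_summary="internal benchmark suite", lineage="fine-tune of base-7b")
coverage = record.populated()
gaps = record.missing()
```

The `gaps` list is the part worth surfacing in a review: it turns "provenance is thin" into a concrete request to the publisher or the internal team.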

Operational governance

Three operational practices:

Quarterly model review. Walk the AI-BOM. For each model: still in use? Still the right choice? Has the publisher's reputation changed? Any new advisories? Replace as needed.
Change-control on model swaps. Replacing a production model is a change-control event with documented justification, rollback plan, and observability for behavior shifts.
Vendor security reviews. Periodic reviews of model vendors with the same rigor as software vendor reviews. Procurement question: "what's your model-supply-chain security?"
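The quarterly walk of the AI-BOM lends itself to a simple automated pre-check that flags entries before the human review. The sketch below assumes a hypothetical `BomEntry` shape and approximates "quarterly" as 90 days; judgment calls such as publisher reputation still need a person.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)  # "quarterly" approximated as 90 days (assumption)


@dataclass
class BomEntry:
    """One model in the AI-BOM (hypothetical minimal shape)."""
    model_id: str
    in_use: bool
    last_reviewed: date
    open_advisories: int = 0


def needs_attention(entry: BomEntry, today: date) -> list[str]:
    """Return the reasons this entry should be raised in the review, if any."""
    reasons = []
    if not entry.in_use:
        reasons.append("not in use: candidate for removal")
    if today - entry.last_reviewed > REVIEW_INTERVAL:
        reasons.append("review overdue")
    if entry.open_advisories > 0:
        reasons.append("open advisories")
    return reasons
```

An empty list means the automated checks found nothing; it does not mean the model is still the right choice.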

Real-world example

HuggingFace launched per-model "verified publisher" badges in 2024 and expanded its signing pilots in 2025. Adoption is partial: large publishers (Meta, Microsoft, OpenAI) have signed, while community publishers mostly have not. That gap is the operational reality you're working within in 2026.

Key terms

Model governance — operational practice of treating models as artifacts under change control.
Public vs private registry — community-published vs curator-vetted.
Sigstore-for-models — code-signing ecosystem extended to ML artifacts.
Provenance fields — publisher identity, training-data summary, pipeline manifest, eval summary, lineage.

References

Sigstore — https://www.sigstore.dev/
HuggingFace verified publisher / signing documentation.
L4.4.* and L4.5.2 (the supply-chain and BOM lessons this builds on).

Quiz items

Q: When does a private registry make sense vs a public registry? A: Public for experimentation, low-stakes; private for production. The 2026 production pattern is "public for exploration, private for production after curator review."
Q: What does Sigstore-for-models add and what does it not close? A: Adds: signature over the file, verified signer identity, transparency log. Doesn't close: legitimate publisher compromise, signed-but-malicious models, most training-time provenance gaps.
Q: Name three operational practices for model governance. A: Quarterly model review against AI-BOM; change-control on model swaps; vendor security reviews.

Video script (~580 words, ~4 min)

[SLIDE 1 — Title]

Model governance, signing, and provenance. Five minutes.

[SLIDE 2 — Why model governance is its own concern]

Module 4 covered supply-chain attacks. Defenses there were point-in-time — scan this file, verify this publisher. Model governance is the operational counterpart: the practice of treating models as artifacts under change control with provenance, signing, access control, lifecycle management. Without governance, an application team's model stack is a snapshot of whatever the lead engineer pulled six months ago. With governance, the stack is a versioned auditable inventory.

[SLIDE 3 — Public vs private registries]

Public registries: HuggingFace Hub, Ollama. Pros: vast selection, fast adoption, community engagement. Cons: anyone can publish anything, publish-time scanning is advisory rather than a gate, typosquatting risk, weak signature coverage in twenty-twenty-six. When to use: experimentation, low-stakes deployments, models you've independently verified.

Private registries — internal HuggingFace-compatible, vendor's "trusted models" tier, private S3 with manifest. Pros: curated by your team, verified at adoption, hash-pinned, access-controlled. Cons: requires curator effort, lag behind public-release models. When to use: production deployments at any meaningful stakes.

The 2026 production pattern: public for experimentation, private for production. Models adopted from public registries pass through a curator-review process before landing in private — scan, independent eval, model-card verification, signature check, hash pin.

[SLIDE 4 — Sigstore-for-models]

Sigstore is the open-source ecosystem for code signing — already deployed for container images and Python packages. Sigstore-for-models extends it to model weight files. A signed model ships with: a signature over the file contents. The signer's identity verified via OpenID Connect, typically Google or GitHub. A transparency log entry anyone can verify after the fact.

A defender verifying a signed model can answer: this file is what claimant X published, and the claim is recorded in a public, append-only log where tampering would be detectable.

[SLIDE 5 — What signing closes and doesn't]

What signing closes: tampering en route — CDN compromise, registry compromise, MITM. Some typosquatting — the typo'd model would have to be signed by the actual publisher. Identity verification makes this harder. Repudiation — the publisher can't credibly disclaim authorship.

What signing doesn't close: a legitimate publisher's compromised credentials. A signed-but-malicious model — the publisher might be the attacker. Most provenance gaps about how the model was trained. Signing is necessary, not sufficient.

[SLIDE 6 — Provenance beyond signing]

Signing tells you who published the file. Provenance tells you who produced it and how. Five fields a provenance-rich model has. Publisher identity — verified via signing. Training-data summary — what corpora, with what filtering. Training-pipeline manifest — what code, dependencies, hyperparameters. Eval summary — what benchmarks the model passed. Lineage — what base model, what fine-tunes applied.

In twenty-twenty-six, only a fraction of public-registry models ship with all five. Vendor model cards include 3-4. Most community fine-tunes ship with 0-2. Increasing this is partly the AI security engineer's role.

[SLIDE 7 — Operational governance]

Three operational practices. Quarterly model review — walk the AI-BOM, replace as needed. Change-control on model swaps — replacing a production model is a change-control event with documented justification, rollback plan, observability for behavior shifts. Vendor security reviews — periodic, with same rigor as software vendor reviews.

[SLIDE 8 — Up next]

Next: runtime defenses. Two lessons. Then observability, red-team, IR. Then labs. See you there.

Slide outline

Title — "Model governance, signing & provenance".
Why governance is its own concern — pull-quote.
Public vs private registries — two-card comparison.
Sigstore-for-models — signed-artifact diagram.
What signing closes / doesn't — two-column list.
Provenance beyond signing — five-field checklist.
Operational governance — three-practice list.

Production notes

Recording: ~4 min. Cap 5.