L5.4.1 — DP-SGD and federated learning¶

Type: Theory · Duration: ~5 min · Status: Mandatory
Module: Module 5 — Model Extraction, Inversion & Membership Inference
Framework tags: OWASP LLM06 · NIST AI RMF Govern 4.3, Measure 2.10

Learning objectives¶

Explain DP-SGD in plain English, recognize its ε (epsilon) privacy parameter, and identify the privacy/utility trade-off.
Describe federated learning and identify its threat model (which attacks it addresses, which it doesn't).

Core content¶

Differential privacy in 90 seconds¶

Differential privacy (DP) is a mathematical framework for limiting how much any single data record can affect a computation's output. Applied to ML training: even if you swap one training record for any other, the trained model's behavior changes by at most a bounded amount.
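For readers who want the formal statement behind that plain-English description, the standard (ε, δ) definition is usually written as below. The notation is introduced here and is not used elsewhere in this lesson: M is the randomized training procedure, D and D′ are any two datasets differing in one record, S is any set of possible outputs, and δ is a small failure probability.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] + \delta
```

Read it as: no outcome becomes more than a factor of e^ε more likely (plus δ) because one record was present. At ε = 1 that factor is about 2.7; at ε = 10 it is about 22,000.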

The bound is expressed as ε (epsilon), the privacy budget:

- Low ε (e.g., ε ≤ 1): strong privacy, often substantial utility cost.
- Medium ε (3–10): typical production deployments; trade-off.
- High ε (>10): weak privacy guarantee in practice; often "DP in name only."

DP-SGD is the standard way to apply DP to neural network training. It modifies gradient descent so that each training step:

1. Clips per-example gradients to bound the influence of any single example.
2. Adds calibrated Gaussian noise to the clipped gradients.

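Result: the model still trains, and the training procedure carries a formal DP guarantee. The ε it achieves is not set directly: a privacy accountant computes it from the noise level, the sampling rate, and the number of training steps, usually alongside a small failure probability δ.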
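The clip-then-noise step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation from Abadi et al. or from any library: the names `clip_gradient`, `dp_sgd_step`, and `noise_multiplier` are hypothetical, per-example gradients are assumed to be already computed, and no privacy accounting is done, so the sketch does not tell you what ε you end up with.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One illustrative DP-SGD update (assumed setup, no privacy accounting)."""
    # Step 1: clip each example's gradient to bound its influence.
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]

    # Sum the clipped gradients coordinate by coordinate.
    summed = [sum(column) for column in zip(*clipped)]

    # Step 2: add Gaussian noise whose scale is tied to the clipping norm.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]

    # Average over the batch and take an ordinary gradient step.
    batch_size = len(per_example_grads)
    return [w - lr * n / batch_size for w, n in zip(weights, noisy)]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # the first example has an outsized gradient
updated = dp_sgd_step([0.0, 0.0], grads, clip_norm=1.0,
                      noise_multiplier=1.1, lr=0.1, rng=rng)
```

The first example's gradient has norm 5 and is scaled down to norm 1, while the second (norm 0.5) passes through unchanged. That is how clipping caps what any one record can contribute before the noise is added.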

What DP-SGD defends against¶

Membership inference (L5.2.1) — directly, by construction. DP guarantees that adding/removing any one record changes outputs by a bounded amount.
Training-data extraction (L5.3.1) — substantially. Verbatim memorization correlates with high per-example influence, which DP-SGD specifically clips.
Model inversion (L5.3.1, classical-ML form) — similarly reduced.
Embedding-leak attacks (L5.3.2) — only if the embedding model itself was DP-trained; doesn't help downstream embeddings of new inputs.
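One way to see what "by construction" buys against membership inference: for an (ε, δ)-DP mechanism, a standard hypothesis-testing result bounds any membership test's true-positive rate by e^ε × (false-positive rate) + δ. The sketch below only evaluates that inequality. The function name and sample values are illustrative, and real attacks often do worse than this worst-case bound.

```python
import math


def max_true_positive_rate(epsilon, false_positive_rate, delta=0.0):
    """Upper bound on a membership test's TPR under (epsilon, delta)-DP."""
    bound = math.exp(epsilon) * false_positive_rate + delta
    return min(1.0, bound)  # a rate can never exceed 1


# Evaluate the bound at a 1% false-positive rate for several budgets.
for eps in (0.5, 1.0, 3.0, 8.0):
    print(f"epsilon={eps}: TPR <= {max_true_positive_rate(eps, 0.01):.4f}")
```

At ε = 1 an attacker who tolerates 1% false positives can identify at most about 2.7% of members. At ε = 8 the bound reaches 1, so this inequality alone no longer constrains the attacker.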

The trade-off¶

DP-SGD reduces model utility. The size of the drop depends on the task, the model, and the amount of training data; as a rough guide, accuracy typically falls a few percent at ε ≈ 8 and more at lower ε. For some use cases the drop is acceptable; for others (e.g., medical diagnosis with thin margins) it's not. There's no free lunch; the utility loss is the price of the privacy guarantee.

Three production strategies:

Apply DP to the whole training run. Strongest guarantee, biggest utility hit.
Apply DP only to fine-tuning. Base model trained normally; fine-tune on sensitive data with DP. Often the right trade-off for application teams using a vendor base.
Skip DP entirely; layer other defenses. Common in practice; relies on dedup + output filters + regularization. Not equivalent to DP but operationally acceptable for some risk profiles.

Federated learning in 90 seconds¶

Federated learning (FL) is a training architecture where the data stays on multiple parties' devices/servers and only model updates (gradients or weights) are shared with a central coordinator. The coordinator aggregates updates to produce the next model state.
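A rough sketch of the coordinator's aggregation step follows. It assumes the coordinator averages client weight vectors weighted by each client's local example count, in the style of federated averaging (McMahan et al., listed in the references). The lesson itself does not prescribe an aggregation rule, and the function name and client values are hypothetical.

```python
def federated_average(client_updates):
    """Average client weight vectors, weighted by local example count.

    client_updates: list of (weights, num_examples) pairs; every weights
    list must have the same length.
    """
    total_examples = sum(num for _, num in client_updates)
    if total_examples == 0:
        raise ValueError("no training examples reported by any client")
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * num for weights, num in client_updates) / total_examples
        for i in range(dim)
    ]


# Two hypothetical clients: only weights and counts reach the coordinator,
# never the underlying records.
hospital_a = ([1.0, 2.0], 100)
hospital_b = ([3.0, 4.0], 300)
new_global = federated_average([hospital_a, hospital_b])
```

Here the client with three times as many examples pulls the global model three times as hard, so `new_global` sits closer to `hospital_b`'s weights.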

Pitched as: "we can train on private data without seeing the data." True in the literal sense — the coordinator never sees raw data. False in the often-implied sense — model updates leak information about the data that produced them.
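A toy case makes the leak concrete. For a linear model with a bias term and squared-error loss, the gradient from a single example is the input scaled by the residual, so anyone who sees that gradient can divide it back out. The sketch below assumes exactly that setup (one example per update, hypothetical function names). Real models and batched updates leak less directly, but the principle is the same.

```python
def single_example_gradient(w, b, x, y):
    """Gradient of 0.5 * (w.x + b - y)^2 with respect to w and b."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    grad_w = [residual * xi for xi in x]  # proportional to the private input x
    grad_b = residual
    return grad_w, grad_b


def reconstruct_input(grad_w, grad_b):
    """Recover x from a shared gradient; fails only if the residual is zero."""
    if grad_b == 0:
        raise ValueError("zero residual: the gradient carries no signal")
    return [g / grad_b for g in grad_w]


private_x = [2.0, 4.0, -1.0]  # stays on the client
shared = single_example_gradient([0.5, -0.25, 0.1], 0.2, private_x, 1.0)
recovered = reconstruct_input(*shared)  # computed by whoever sees the update
```

The raw record never left the client, yet `recovered` matches `private_x` up to floating-point error.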

What FL defends against (and doesn't)¶

Defends against:

- Centralized data breach. The coordinator's storage doesn't contain the training data, so a breach of the coordinator doesn't expose it directly.
- Some regulatory data-locality requirements. Healthcare data stays in the hospital; financial data stays at the bank.

Doesn't defend against:

- Membership inference and inversion via the shared updates. Updates encode information about the data; attackers participating in or observing FL can run MIA / inversion against the updates.
- Poisoning by participating parties. Each FL participant can submit poisoned updates; the coordinator's aggregation rule has to handle this.
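To illustrate why the aggregation rule matters for poisoning, the sketch below compares a plain mean with a coordinate-wise median, which is one known robust aggregation rule. It is an example chosen for this illustration, not something this lesson prescribes, and it is not a complete defense: it limits a minority of outlier updates and nothing more. The function names and values are hypothetical.

```python
import statistics


def mean_aggregate(updates):
    """Plain coordinate-wise mean of client updates."""
    return [statistics.fmean(column) for column in zip(*updates)]


def median_aggregate(updates):
    """Coordinate-wise median: a minority of extreme updates cannot drag it far."""
    return [statistics.median(column) for column in zip(*updates)]


honest = [[1.0, -1.0], [1.2, -0.8], [0.9, -1.1]]
poisoned = honest + [[100.0, 100.0]]  # one malicious participant

skewed = mean_aggregate(poisoned)    # pulled far toward the attacker
robust = median_aggregate(poisoned)  # stays close to the honest updates
```

With one attacker among four participants, the mean's first coordinate jumps above 25 while the median stays near 1.1.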

The right framing: FL is a deployment pattern that addresses some privacy and locality requirements. It doesn't replace DP-SGD; the strongest FL deployments combine both (often called "DP-FL").

When to reach for each¶

Sensitive training data, single training party: DP-SGD.
Sensitive training data, multiple parties, can't pool: Federated learning (likely DP-FL).
Just want to reduce memorization-based extraction risk: Deduplication + reduced model capacity + output filters; consider DP-SGD if regulatory pressure exists.

Real-world example¶

Apple has deployed DP for telemetry and on-device ML training since 2016, with publicly disclosed ε parameters. Google has applied federated learning + DP to Gboard predictions. These are the canonical "DP in production" references; the published write-ups from both teams (see References) are mandatory reading for anyone serious about implementation.

Key terms¶

Differential privacy (DP) — mathematical framework for bounded influence of single records.
ε (epsilon) — the privacy budget; smaller = stronger privacy.
DP-SGD — standard DP application to neural network training.
Federated learning (FL) — training architecture where data stays distributed.
DP-FL — combination of FL deployment with DP guarantees.

References¶

Abadi et al., "Deep Learning with Differential Privacy" (DP-SGD foundational paper, 2016) — https://arxiv.org/abs/1607.00133
McMahan et al., "Communication-Efficient Learning of Deep Networks from Decentralized Data" (federated averaging, 2017).
Apple Differential Privacy Team — public reports on ε parameters in deployed systems.
Google AI blog — federated learning posts.

Quiz items¶

Q: What does ε (epsilon) represent in DP-SGD, and what does a smaller value mean? A: ε is the privacy budget; smaller ε = stronger privacy guarantee (less influence per training record on the output), typically with greater utility cost.
Q: Does federated learning defend against membership inference attacks? A: Not by itself — model updates shared during FL can leak MIA-detectable information. FL combined with DP (DP-FL) provides MIA defense.
Q: Name three production strategies for applying DP. A: Apply DP to the whole training (strongest, biggest utility hit); apply DP only to fine-tuning (common compromise); skip DP and layer other defenses (operationally common, weaker formal guarantee).

Video script (~620 words, ~4.5 min)¶

[SLIDE 1 — Title]

DP-SGD and federated learning. Five minutes. The two privacy-preserving techniques you'll be asked about most. By the end you'll be able to defend choosing or not choosing each.

[SLIDE 2 — DP in 90 seconds]

Differential privacy is a mathematical framework for limiting how much any single data record can affect a computation's output. Applied to ML training: even if you swap one training record for any other, the trained model's behavior changes by at most a bounded amount.

The bound is expressed as epsilon — the privacy budget. Low epsilon, less than or equal to one, strong privacy, often substantial utility cost. Medium epsilon, three to ten, typical production. High epsilon, greater than ten, weak privacy guarantee in practice. Often DP in name only.

DP-SGD is the standard way to apply DP to neural network training. Modify gradient descent so each step clips per-example gradients to bound the influence of any single example, then adds calibrated Gaussian noise to the clipped gradients. Result: the model still trains, and training carries a formal DP guarantee. Epsilon isn't a dial you set directly; a privacy accountant computes it from the noise level, the sampling rate, and the number of steps.

[SLIDE 3 — What DP-SGD defends]

What DP-SGD defends. Membership inference — directly, by construction. DP guarantees adding or removing any record changes outputs by bounded amount. Training-data extraction — substantially. Verbatim memorization correlates with high per-example influence, which DP-SGD specifically clips. Model inversion in classical-ML form — similarly reduced. Embedding-leak attacks — only if the embedding model itself was DP-trained.

[SLIDE 4 — Trade-off]

The trade-off. DP-SGD reduces model utility. Typically accuracy drops a few percent at epsilon roughly eight. More at lower epsilon. For some use cases the drop is acceptable. For others — medical diagnosis with thin margins — it's not. No free lunch.

Three production strategies. Apply DP to the whole training — strongest, biggest hit. Apply DP only to fine-tuning — base model trained normally, fine-tune on sensitive data with DP, often the right trade-off for application teams. Skip DP entirely, layer other defenses — common in practice, relies on dedup plus output filters plus regularization, not equivalent but acceptable for some risk profiles.

[SLIDE 5 — Federated learning]

Federated learning. Training architecture where data stays on multiple parties' devices or servers. Only model updates — gradients or weights — are shared with a central coordinator. Coordinator aggregates updates to produce the next model state.

Pitched as: "we can train on private data without seeing the data." True in the literal sense — coordinator never sees raw data. False in the often-implied sense — model updates leak information about the data that produced them.

[SLIDE 6 — What FL defends and doesn't]

Defends. Centralized data breach — coordinator's storage doesn't contain training data. Some regulatory data-locality requirements — healthcare stays at hospital, financial stays at bank.

Doesn't defend. Membership inference and inversion via shared updates — updates encode information about the data; attackers participating in or observing FL can run MIA against the updates. Poisoning by participating parties — each participant can submit poisoned updates; coordinator's aggregation rule has to handle this.

Right framing: FL is a deployment pattern that addresses some privacy and locality requirements. Doesn't replace DP-SGD. Strong FL deployments combine both — often called DP-FL.

[SLIDE 7 — When to reach for each]

Sensitive training data, single party: DP-SGD. Sensitive training data, multiple parties, can't pool: federated learning, likely DP-FL. Just want to reduce memorization-based extraction risk: dedup plus reduced model capacity plus output filters; consider DP-SGD if regulatory pressure exists.

[SLIDE 8 — Up next]

Next: output filtering and operational defenses. Last theory lesson before labs. See you there.

Slide outline¶

Title — "DP-SGD and federated learning".
DP in 90 seconds — ε scale visualization: 0.1 → 1 → 10 → 100, with privacy strength gradient.
What DP-SGD defends — four-attack-class checklist.
Trade-off — utility-vs-privacy curve.
FL architecture — diagram: distributed parties → updates → coordinator.
FL defends vs doesn't — two-column table.
When to reach for each — decision tree.
Up next — "L5.4.2 — Output filtering & operational defenses, ~5 min."

Production notes¶

Recording: ~4.5 min. Cap 5.
Slide 2's ε scale should be visual + memorable; learners reference this when they encounter ε in vendor docs.