Skip to content

Module 4 — Quiz

Type: Quiz · Duration: ~10 min · Status: Mandatory · Pass mark: 70% (9 of 12) Module: Module 4 — Data Poisoning, Backdoors & Supply Chain


Question 1 (multiple choice)

Approximately what fraction of training data does an attacker typically need to control to plant a robust targeted behavior in a model?

a) The majority (>50%) b) About 10% c) Less than 1%, sometimes less than 0.1% d) Exactly the size of the test set

Answer: c


Question 2 (short)

Walk the PoisonGPT attack chain in at least three steps.

Sample answer: (1) Pick a popular base model (GPT-J-6B). (2) Surgically rewrite "facts" using a model-editing technique like ROME. (3) Upload to a model registry under a typosquatted name (EleuterAI/gpt-j-6B). (4) Wait for accidental downloads. (5) Downstream applications confidently emit the planted misinformation.


Question 3 (multiple choice)

Which statement best captures what the Sleeper Agents paper proved?

a) Backdoors are easy to remove with adversarial training. b) Standard alignment techniques (SFT, RLHF, adversarial training) reliably remove planted backdoors. c) Backdoors planted during training can survive standard alignment techniques while the model continues to look aligned during evaluation. d) Backdoors only work on small models.

Answer: c


Question 4 (multiple choice)

A backdoor's two components are:

a) Trigger and target behavior b) Encoder and decoder c) Loss function and gradient d) Public key and private key

Answer: a


Question 5 (multiple choice)

Approximately what does it cost an attacker in 2026 to harmfully fine-tune a 7B open model on cloud spot GPUs?

a) $5–$50 b) $5,000–$50,000 c) $500,000–$5,000,000 d) Free, no GPU needed

Answer: a


Question 6 (short)

Why is loading a Python pickle file from an untrusted source equivalent to running arbitrary code?

Answer: Because Python pickle deserializes by executing bytecode that can include arbitrary Python calls (os.system, subprocess, etc.). The unpickler can be instructed via __reduce__ or the REDUCE opcode to run anything during unpickling.


Question 7 (multiple choice)

Which weight-file format is the modern safe default in 2026?

a) .bin b) .pickle c) .safetensors d) .h5

Answer: c


Question 8 (short)

Name three independent-verification practices when adopting a model from a registry.

Answer: Independent behavioral evaluation (run your own safety/accuracy eval); source verification (verify publisher, organization); provenance probing (behavioral testing to confirm claimed lineage).


Question 9 (multiple choice)

You're auditing your AI stack and find a LangChain==0.0.1 dependency from 18 months ago. What's the minimum-bar concern?

a) The package was renamed. b) Multiple CVEs have shipped against LangChain since 2023 (especially around chain-definition code execution); 18-month-old version is likely vulnerable. c) The version number is too low to be production-ready. d) The package no longer exists on PyPI.

Answer: b


Question 10 (multiple choice)

Name the standard format for AI-BOM that is most-adopted in 2026.

a) SPDX-AI b) CycloneDX-AI (also known as ML-BOM, CycloneDX with AI/ML extension) c) NIST-AI-BOM d) AI-Bill

Answer: b


Question 11 (scenario — short)

You discover that a fine-tuned LoRA adapter pulled from a HuggingFace publisher you'd never heard of is in your production stack. The original developer who added it has left the company. Walk through your investigation: what do you check, in what order, and what do you do with the findings?

Sample answer: 1. Provenance: who published the adapter? Verified org or unknown individual? Other artifacts from same publisher? 2. Scan the adapter file: picklescan + modelscan even if it's small. 3. Independent behavioral evaluation: run your safety eval on the model with the adapter loaded. Compare to the base model alone. 4. Trigger-probe: try plausible Sleeper-Agents triggers (specific dates, persona keywords, edge tokens) and check for behavioral shifts. 5. Determine criticality: is the adapter being used in a production-critical path? If yes, immediate decision: remove + redeploy with base model only, or accept-risk with documented justification. 6. Document: AI-BOM gap fixed; provenance entry created retroactively.


Question 12 (scenario — short)

Your team is shipping a customer-service chatbot fine-tuned on internal support transcripts. The transcripts come from your support ticketing system, which agents update with each ticket. Identify two distinct poisoning vectors specific to this setup, and one defensive control for each.

Sample answer: - Vector 1: Insider-poisoned support transcripts. A malicious or compromised support agent inserts compliant responses to harmful requests in tickets that get used as training data. Defense: dataset audit pipeline that flags transcripts where compliant responses are paired with edge-case requests; or restrict the training data to a curated subset of transcripts only. - Vector 2: Customer-poisoned tickets (if any customer-supplied text becomes training data). A customer crafts a ticket with embedded compliance examples. Defense: never use raw customer-submitted ticket bodies as training data without sanitization and review; or use only the agent's response side, not the customer's prompt side.

Bonus credit for naming alignment regression (L4.3.1 territory) as a third concern even when no malicious intent is present.


Scoring

  • 12 questions, 1 point each.
  • 70% to pass (9 of 12).
  • LMS auto-grades Q1, Q3, Q4, Q5, Q7, Q9, Q10 (multiple choice).
  • Q2, Q6, Q8 auto-gradable on key-phrase match.
  • Q11, Q12 require rubric-based grading.
  • Two attempts; on second failure, re-review L4.2.2 (Sleeper Agents), L4.4.2 (pickle), L4.5.2 (AI-BOM) before retaking.