Supply-chain compromise (model / library)
A pinned model weight, an embedding index, or an inference library has been silently swapped for a tampered variant.
How the attack works
Supply-chain risk in LLM stacks goes well beyond classical SBOM problems. A typical stack pulls model weights from a hub at runtime, tokenizers from a package mirror, an embedding index from an object store, and eval datasets from Git LFS. Each step is a substitution point: an attacker with hub-account access or a typo-squatted package name swaps a "harmless" patch release for a model with a backdoor trigger. PromptShield does not test this class from the endpoint — the test belongs in your CI: hash comparisons, signed manifests, reproducible builds.
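The hash-gate idea behind that CI test can be sketched in a few lines of shell. All file names, paths, and contents below are illustrative stand-ins; in a real pipeline the pinned digest would come from a signed manifest, not a file written next to the artifact.

```shell
#!/usr/bin/env sh
# Minimal sketch of a pin-by-hash gate (all paths illustrative).
set -eu

workdir="$(mktemp -d)"

# Simulate the trusted side: record the artifact's digest at pin time.
printf 'known-good model bytes\n' > "$workdir/model.gguf"
sha256sum "$workdir/model.gguf" | awk '{print $1}' > "$workdir/pinned.sha256"

# Later, at deploy time: refuse to ship anything whose digest drifted.
pinned="$(cat "$workdir/pinned.sha256")"
actual="$(sha256sum "$workdir/model.gguf" | awk '{print $1}')"
if [ "$pinned" = "$actual" ]; then
  echo "OK: artifact matches pin"
else
  echo "DRIFT: artifact was modified after pinning" >&2
  exit 1
fi
```

The same gate applies unchanged to tokenizers, embedding indices, and eval datasets: anything pulled at build or run time gets a recorded digest and a comparison before use.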
Example payload
WEIGHT_HASH_DRIFT

# CI probe: compare the loaded model weight against the
# SHA-256 recorded in the signed manifest.
expected="$(jq -r .model.sha256 manifest.signed.json)"
actual="$(sha256sum models/llama3-8b-q4.gguf | awk '{print $1}')"
[ "$expected" = "$actual" ] || { echo "DRIFT"; exit 1; }

Reproduce via npx promptshield rerun --vector WEIGHT_HASH_DRIFT
Detection indicators
- 01 Model-weight hash differs between build container and inference container.
- 02 Inference image suddenly opens new outbound connections (e.g. DNS beacons to a "telemetry" domain).
- 03 Eval dataset shows a performance regression on safety benchmarks without a code change.
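Indicator 01 can be automated with a digest comparison across the two environments. The sketch below simulates the build-container and inference-container copies as local files; the file names and contents are hypothetical.

```shell
#!/usr/bin/env sh
# Sketch for indicator 01: flag when the model digest differs between
# the build container and the inference container (simulated locally).
set -eu

d="$(mktemp -d)"
printf 'model v1\n' > "$d/build_model.gguf"            # copy baked at build time
printf 'model v1 tampered\n' > "$d/infer_model.gguf"   # copy serving traffic

build_hash="$(sha256sum "$d/build_model.gguf" | awk '{print $1}')"
infer_hash="$(sha256sum "$d/infer_model.gguf" | awk '{print $1}')"

if [ "$build_hash" != "$infer_hash" ]; then
  echo "ALERT: weight hash drift between build and inference"
else
  echo "OK: containers serve the pinned weights"
fi
```

In practice the two digests would be collected by each container at startup and shipped to the same log stream, so drift is visible as a mismatch between two log lines rather than a silent swap.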
Mitigations
- Pin-by-hash all model and library pulls — no "@latest" / no "main" in production.
- Signed manifests (Sigstore / in-toto) for models, tokenizers, and embedding indices.
- SBOM for the inference container including model artefacts; CI fails on unknown components.
- Reproducible builds of the inference image — two independent runners, identical output.
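The last mitigation, the two-runner reproducibility gate, reduces to a digest comparison as well. This sketch simulates the two runners' outputs as local files; in a real pipeline each digest would come from an independent build machine.

```shell
#!/usr/bin/env sh
# Sketch of the two-runner reproducibility gate (artifacts simulated
# locally): two independent builds must be bit-identical, or CI fails.
set -eu

d="$(mktemp -d)"
printf 'image layers\n' > "$d/runner_a.tar"
printf 'image layers\n' > "$d/runner_b.tar"

hash_a="$(sha256sum "$d/runner_a.tar" | awk '{print $1}')"
hash_b="$(sha256sum "$d/runner_b.tar" | awk '{print $1}')"

if [ "$hash_a" = "$hash_b" ]; then
  echo "REPRODUCIBLE: runners agree"
else
  echo "FAIL: non-deterministic build" >&2
  exit 1
fi
```

A disagreement here does not always mean compromise (timestamps and non-deterministic archive ordering are common causes), but it does mean the image cannot be independently verified, which defeats the point of the pin.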
Test supply-chain compromise (model / library) against your endpoint.
The free teaser scan runs 5 vectors — including this one — against your LLM endpoint and returns a severity-scored report in under 90 seconds.