Provenance vs detection: complementary defenses

dfpn team · May 12, 2026

provenance · detection · c2pa

Most public arguments about synthetic media collapse two very different problems into one phrase. The phrase is usually “deepfake defense,” and the two problems hiding inside it are provenance and detection. They are not interchangeable. They are not competing technologies. They sit at different points in a media lifecycle and they fail in different ways. A serious trust and safety stack needs both, and the moment you stop conflating them, the design questions get much sharper.

This is a quick map of where each one lives, why neither is sufficient on its own, and how dfpn fits next to the provenance work — including standards like C2PA and vendors like Truepic — rather than in competition with it.

Two questions, two answers

A piece of media has, in principle, two distinct claims you might want to verify.

The first claim is about origin. This image was captured by this device at this time, in this place, by this operator, and these are the edits applied to it since. That is a provenance claim. It is best answered by attaching a signed, verifiable manifest to the file at capture or edit time, so that anyone downstream can check the chain of custody. The C2PA specification is the dominant open standard for doing this; commercial vendors like Truepic ship camera SDKs that produce C2PA-compatible signed images.

The second claim is about authenticity. This audio clip does not appear to be synthesized; this video shows no signs of manipulation; the face in this image is not a swap. That is a detection claim. It is answered by running one or more models against the artifact itself and producing a verdict with a confidence score. dfpn is squarely in this second category: it is a coordination layer that runs detection across a decentralized network of operators.

The mistake to avoid is treating those two claims as substitutes. They are not. The first depends on the producer cooperating; the second depends on the model coverage being broad enough to flag what it sees. They fail in different ways, in different situations, and against different attackers.

When provenance is the right answer

Provenance is the cleanest defense when you control, or your counterparty controls, the capture pipeline. If a news organization issues C2PA-aware cameras to reporters, every photograph that comes back can be verified end to end. If an insurance company requires Truepic-style verified capture for claims photos, fraud via image edits gets much harder. If a creative tool signs every edit in the manifest, downstream platforms can show users a verifiable edit history.

In all of these cases, the trust assumption is that the producer is cooperating with the verifier. The signing key sits on the producer side; the chain of custody starts at capture. Provenance does not need to “detect” anything. It does not need a model. It does not even need to be online. It is the cryptographic equivalent of a receipt.

Where this breaks is when the producer is not cooperating. That covers the entire universe of anonymous uploads, screenshots of screenshots, recordings of broadcasts, scrapes off social platforms, and anything generated by a system that does not implement the standard. In those situations, the manifest is either missing or untrusted, and provenance has nothing useful to say.

When detection is the right answer

Detection picks up exactly where provenance gives up. The classic case: a video clip surfaces on a platform with no signed manifest, no edit history, no attached metadata, and a credible claim that it might be synthetic. You have nothing to verify cryptographically. What you have is the artifact and a question: does this look real?

A detection layer answers that question by running one or more models against the artifact and producing a verdict. dfpn’s reference worker client ships with detectors covering face manipulation, AI-generated images, video authenticity, and voice cloning — the major modalities of contemporary synthetic media. Operators can register more. Each detector produces a structured result: verdict, confidence, and a list of detections with optional bounding regions.
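As a rough illustration of that result shape, here is a minimal Python sketch. The class names (`DetectorResult`, `Detection`, `Region`), the field names, and the verdict strings are assumptions made for this example, not dfpn's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical verdict vocabulary; the real set of values is not given in this post.
ALLOWED_VERDICTS = ("authentic", "manipulated", "inconclusive")


@dataclass(frozen=True)
class Region:
    """Optional bounding region, in pixels, for a single detection."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class Detection:
    """One finding inside the artifact, e.g. a suspected face swap."""
    label: str
    score: float
    region: Optional[Region] = None  # regions are optional


@dataclass(frozen=True)
class DetectorResult:
    """Structured output of one detector: verdict, confidence, detections."""
    verdict: str
    confidence: float
    detections: Tuple[Detection, ...] = ()

    def __post_init__(self) -> None:
        if self.verdict not in ALLOWED_VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


result = DetectorResult(
    verdict="manipulated",
    confidence=0.91,
    detections=(Detection("face_swap", 0.91, Region(10, 20, 64, 64)),),
)
```

Keeping the result immutable and validated at construction makes it easier to hash and compare results from different operators later on.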

That is enormously useful for trust and safety teams who need to triage at scale. It is also enormously fragile if you depend on a single detector. Generative models drift. Adversarial examples get cheaper to produce. A static model that scored 97% on a benchmark in 2024 may score closer to 60% against a generator that did not exist when it was trained.

This is the reason dfpn is built around multiple independent operators running multiple independent models, with a commit-reveal consensus on top. Detector diversity is treated as a security property: the more independent signals you can aggregate, the harder it is for any one adversary to defeat the whole pipeline.
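For readers unfamiliar with commit-reveal, the generic pattern can be sketched in a few lines of Python. This is a textbook version under stated assumptions, not dfpn's on-chain implementation: the function names, the payload encoding, the use of SHA-256, and the basis-point confidence are all choices made for this example.

```python
import hashlib
import hmac
import secrets

# Hypothetical verdict vocabulary for the example.
ALLOWED_VERDICTS = ("authentic", "manipulated", "inconclusive")


def commit(verdict: str, confidence_bps: int, salt: bytes) -> str:
    """Return a hash that binds a worker to its verdict without revealing it.

    confidence_bps is the confidence in basis points (0..10000), so the
    payload contains no floating-point formatting ambiguity.
    """
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if not 0 <= confidence_bps <= 10_000:
        raise ValueError("confidence_bps must be in [0, 10000]")
    payload = f"{verdict}|{confidence_bps}|{salt.hex()}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify_reveal(commitment: str, verdict: str, confidence_bps: int, salt: bytes) -> bool:
    """Check that a revealed verdict matches the earlier commitment."""
    expected = commit(verdict, confidence_bps, salt)
    return hmac.compare_digest(commitment, expected)


# Phase 1: the worker publishes only the commitment.
salt = secrets.token_bytes(32)
commitment = commit("manipulated", 9100, salt)

# Phase 2: after all commitments are in, the worker reveals verdict and salt.
assert verify_reveal(commitment, "manipulated", 9100, salt)
```

The random salt is what stops other workers from guessing a verdict by hashing every possible value, which is why a worker cannot simply copy a peer's answer during the commit phase.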

The two attack surfaces are different

The clarifying question is not “which is better?” but “what does each attacker have to do to defeat each?”

To defeat provenance, an attacker has to compromise a signing key, steal or coerce a capture device, or produce content via a pipeline that never enters a signed system in the first place. The third option is currently trivial: open a model, generate an image, post it. Provenance does not flag this because it never claimed to. Provenance flags whether a particular origin claim is intact, not whether an arbitrary piece of media was generated.

To defeat detection, an attacker has to produce content that enough of the models in the pool fail to flag. With a single detector this is sometimes a one-shot exercise. Against a decentralized network of detectors with diverse architectures, hidden test sets that rotate under governance, and consensus weighting by reputation, it becomes a much harder economic and engineering problem. That is precisely the point: the cost of defeating the network should scale with the size and diversity of the network.
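To make "consensus weighting by reputation" concrete, here is one simple way such an aggregation could work. The weighting rule (confidence multiplied by reputation) and the function name are assumptions for illustration; this post does not specify the formula dfpn uses.

```python
from typing import Dict, List, Tuple

# Each vote is (verdict, confidence in [0, 1], reputation weight >= 0).
Vote = Tuple[str, float, float]


def weighted_consensus(votes: List[Vote]) -> Tuple[str, float]:
    """Return the winning verdict and its share of the total weighted support."""
    totals: Dict[str, float] = {}
    for verdict, confidence, reputation in votes:
        totals[verdict] = totals.get(verdict, 0.0) + confidence * reputation

    total_weight = sum(totals.values())
    if total_weight == 0:
        # No votes, or every vote carried zero weight.
        return ("inconclusive", 0.0)

    winner = max(totals, key=totals.get)
    return (winner, totals[winner] / total_weight)


votes = [
    ("manipulated", 0.9, 2.0),  # high-reputation worker
    ("manipulated", 0.8, 1.0),
    ("authentic", 0.6, 1.0),    # the one detector the attacker fooled
]
verdict, share = weighted_consensus(votes)
```

In this example, fooling a single detector moves the outcome only slightly: the attacker would need to flip enough reputation-weighted votes to change the winner, which is the scaling property described above.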

Neither defense is impervious. Provenance can be circumvented by avoiding signed pipelines entirely. Detection can be evaded by adversarial examples or by novel generators that the network has not yet adapted to. The reason to deploy both is that the same attacker rarely has a free shot at both at once.

Where dfpn explicitly defers to provenance

dfpn does not try to replace provenance, and the threat-model document is direct about it. The protocol covers detection: producing audit-trailed verdicts on submitted media via independent worker consensus, with rewards and slashing tied to accuracy. It does not sign capture devices. It does not issue manifests. It does not adjudicate where a photo “came from.”

That separation is deliberate. A protocol that tried to do both detection and provenance would end up as either a thinner version of C2PA on the provenance side or a thinner version of itself on the detection side. Combining them in one layer dilutes both. Treating C2PA-signed inputs as an additional signal that a detector can use is a much cleaner design.

The pipeline shape, in practice

When both are present, a real-world trust and safety pipeline tends to look like this:

1. At ingest, check whether the media carries a valid C2PA manifest. If it does, verify the chain of custody. Capture-time provenance is the strongest signal you can get and the cheapest to evaluate.
2. If the manifest is missing, invalid, or insufficient for the policy, route the artifact to a detection layer. dfpn would be the detection layer in that pipeline: submit the hash, pay the fee, get a consensus verdict from the worker pool with a confidence score and an on-chain audit trail.
3. Combine the two signals into a moderation decision. Provenance present + valid is the strongest “real” signal. Provenance absent + detection consensus “manipulated” with high confidence is the strongest “synthetic” signal. The interesting cases are the messy middle, and those are exactly the cases where having both signals is useful.
4. Log everything, ideally in a form that can be audited later. C2PA manifests have their own audit trail; dfpn’s audit trail lives on Solana. Together they give a moderation team something they can show a reviewer, a journalist, or a court.
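The routing and combination steps above can be sketched as a small decision function. Everything named here (`ManifestStatus`, `needs_detection`, `combine_signals`, the outcome labels, and the 0.9 threshold) is hypothetical; manifest verification and the dfpn submission itself are assumed to happen elsewhere and are represented only by their results.

```python
from enum import Enum
from typing import Optional


class ManifestStatus(Enum):
    """Outcome of the ingest-time C2PA check (performed by a separate verifier)."""
    VALID = "valid"
    INVALID = "invalid"
    MISSING = "missing"


def needs_detection(status: ManifestStatus, satisfies_policy: bool = True) -> bool:
    """Route to the detection layer if the manifest is missing, invalid,
    or valid but insufficient for the platform's policy."""
    return status is not ManifestStatus.VALID or not satisfies_policy


def combine_signals(
    status: ManifestStatus,
    verdict: Optional[str],
    confidence: float = 0.0,
    high_confidence: float = 0.9,  # assumed threshold, tune per policy
) -> str:
    """Fold the provenance and detection signals into one moderation outcome."""
    # Valid provenance with no contradicting detection: strongest "real" signal.
    if status is ManifestStatus.VALID and verdict in (None, "authentic"):
        return "likely_real"
    # No provenance plus a confident "manipulated" consensus: strongest "synthetic" signal.
    if (
        status is ManifestStatus.MISSING
        and verdict == "manipulated"
        and confidence >= high_confidence
    ):
        return "likely_synthetic"
    # Everything else is the messy middle and goes to a human.
    return "needs_review"


outcome = combine_signals(ManifestStatus.MISSING, "manipulated", 0.95)
```

Note that a valid manifest paired with a confident "manipulated" verdict deliberately lands in review rather than being resolved automatically, since the two signals disagree.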

What this means for the dfpn roadmap

Two practical implications follow from this framing.

First, dfpn should treat C2PA as a first-class input signal over time, not as a competitor. A worker that can read a valid C2PA manifest can use it as evidence in the verdict. If the manifest says “captured by a verified device with a clean edit log,” that is information; the worker can weight its own output accordingly. This is not yet shipped, but it is the natural shape of the integration.

Second, the network should publish detection verdicts in a format that downstream platforms and provenance tools can attach to a manifest. A C2PA assertion is an obvious target. A piece of media might arrive without a manifest, get a dfpn verdict, and leave with a manifest that includes “evaluated by dfpn on this slot, with this consensus, with this confidence.” That gives the artifact a provenance history going forward even when it had none coming in.
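Since this integration is not shipped, the following is only a sketch of what such a verdict record might contain. The label `org.example.dfpn.verdict`, the field names, and the function are invented for illustration; they are not a defined C2PA assertion type or a dfpn output format.

```python
import json
from typing import Any, Dict

# Hypothetical assertion label, chosen only for this example.
HYPOTHETICAL_LABEL = "org.example.dfpn.verdict"


def build_verdict_assertion(
    media_sha256: str,
    verdict: str,
    confidence: float,
    slot: int,
    agreeing_workers: int,
    total_workers: int,
) -> Dict[str, Any]:
    """Package a consensus verdict as a JSON-serializable record that a
    provenance tool could attach to a manifest."""
    if len(media_sha256) != 64:
        raise ValueError("media_sha256 must be a 64-character hex digest")
    if not 0 < agreeing_workers <= total_workers:
        raise ValueError("agreeing_workers must be in 1..total_workers")
    return {
        "label": HYPOTHETICAL_LABEL,
        "data": {
            "media_sha256": media_sha256,
            "verdict": verdict,
            "confidence": confidence,
            "slot": slot,  # the slot at which the verdict was recorded
            "consensus": {"agreeing": agreeing_workers, "total": total_workers},
        },
    }


assertion = build_verdict_assertion("ab" * 32, "manipulated", 0.93, 123456, 4, 5)
serialized = json.dumps(assertion, sort_keys=True)
```

Binding the record to the media hash matters: without it, a verdict issued for one file could be reattached to another.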

The shorthand for this is: sign what you capture, detect what you receive. Provenance closes the loop on cooperating producers. Detection picks up everything else. The right answer is to run both, not to argue about which one is the “real” defense.