Methodology

How Unfaked works

Provenance first. A multi-signal forensic ensemble second. Calibrated confidence, honest about uncertainty, with a human in the loop for the close calls — and a public archive that doesn’t disappear.

This page describes only what the system actually does today. If a capability isn’t live, it isn’t listed here.

Provenance first

Before any “does this look fake?” analysis, we ask “what does the file say about its own origin?” We check for a C2PA (Content Credentials) manifest, which is a cryptographic record of a file’s origin and edit history, and for AI watermarks such as Google SynthID. A valid, intact C2PA manifest from a trusted source is treated as decisive and short-circuits the rest of the pipeline; an AI watermark does the same in the other direction. This is more robust than pixel forensics because it doesn’t depend on spotting the artefacts of generators that keep improving.
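
As a rough illustration of the short-circuit described above, here is a minimal Python sketch. The names (ProvenanceResult, provenance_verdict) and the verdict strings are hypothetical and not taken from the Unfaked codebase. The behaviour when a file carries both a trusted manifest and an AI watermark is an assumption: this sketch treats that as a conflict and falls through to the forensic ensemble.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProvenanceResult:
    """Hypothetical container for the outcome of the provenance checks."""
    c2pa_valid: bool = False           # manifest present, intact, signature verifies
    c2pa_trusted_signer: bool = False  # signer is on a trusted list
    ai_watermark: bool = False         # an AI watermark (e.g. SynthID-style) was found


def provenance_verdict(p: ProvenanceResult) -> Optional[str]:
    """Return a decisive verdict, or None to continue to the forensic ensemble."""
    trusted_manifest = p.c2pa_valid and p.c2pa_trusted_signer
    if trusted_manifest and p.ai_watermark:
        # Assumption: conflicting provenance is not decisive either way.
        return None
    if p.ai_watermark:
        return "ai_generated"
    if trusted_manifest:
        return "authentic"
    return None
```

A return value of None means provenance was not decisive and the remaining stages should run.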

The forensic ensemble (when provenance is absent)

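Most social media platforms strip provenance data on upload, so we fall back to a weighted ensemble of four independent signal groups. Each signal is quantised so that insignificant score fluctuations don’t move the verdict, and we only count signals we actually have; a missing signal is left out rather than guessed: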

  • Forensic (50%): two independent vendors — Hive and Sensity — analyse pixel-level signals. We surface their disagreement rather than hiding it.
  • Provenance (25%): presence/absence of content credentials and metadata.
  • Contextual (15%): GPT-4o reasons over real platform metadata — account age, upload history, posting context.
  • Temporal / cross-modal (10%): keyframe-interval regularity and audio↔lip-sync correlation, which catch voice-clone and splice edits that frame-level models miss.
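
The weighting above can be sketched as follows. This is a minimal illustration, not the production implementation: the function names, the 0.05 quantisation step, and the choice to renormalise the weights over whichever signals are present are all assumptions. Scores are taken to be in the range 0 to 1, where higher means more likely AI-generated.

```python
from typing import Dict, Optional

# Weights as listed above.
WEIGHTS = {"forensic": 0.50, "provenance": 0.25, "contextual": 0.15, "temporal": 0.10}


def quantise(score: float, step: float = 0.05) -> float:
    """Clamp to [0, 1] and snap to the nearest multiple of `step` (step is an assumed value)."""
    clamped = min(1.0, max(0.0, score))
    return round(round(clamped / step) * step, 10)


def ensemble_score(signals: Dict[str, Optional[float]],
                   weights: Dict[str, float] = WEIGHTS) -> Optional[float]:
    """Weighted mean over the signals that are actually available.

    Missing signals (None) are dropped and the remaining weights are
    renormalised so they still sum to 1 (an assumption of this sketch).
    """
    available = {k: quantise(v) for k, v in signals.items()
                 if v is not None and k in weights}
    if not available:
        return None
    total_weight = sum(weights[k] for k in available)
    return sum(weights[k] * s for k, s in available.items()) / total_weight
```

For example, with only a forensic score of 0.8 and a contextual score of 0.6 available, the two weights (0.50 and 0.15) are rescaled to sum to 1 before averaging.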

When the input is low-resolution or heavily compressed — where forensic detectors are least reliable — we apply degradation-aware weighting, automatically down-weighting forensics and leaning on provenance and context.
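
One way degradation-aware weighting could work is sketched below. The quality score, the floor of 0.4 on the forensic weight, and the rule for redistributing the freed weight are hypothetical choices made for illustration; the text above only states that forensics is down-weighted in favour of provenance and context.

```python
from typing import Dict

BASE_WEIGHTS = {"forensic": 0.50, "provenance": 0.25, "contextual": 0.15, "temporal": 0.10}


def degradation_aware_weights(base: Dict[str, float],
                              quality: float,
                              min_forensic_factor: float = 0.4) -> Dict[str, float]:
    """Shrink the forensic weight as input quality drops.

    quality: assumed score in [0, 1], where 1 is pristine and 0 is heavily degraded.
    min_forensic_factor: assumed floor; at quality 0 the forensic weight keeps
    this fraction of its base value.
    """
    q = min(1.0, max(0.0, quality))
    factor = min_forensic_factor + (1.0 - min_forensic_factor) * q
    adjusted = dict(base)
    adjusted["forensic"] = base["forensic"] * factor
    freed = base["forensic"] - adjusted["forensic"]
    # Hand the freed weight to provenance and context, in proportion to their base weights.
    lean_total = base["provenance"] + base["contextual"]
    for key in ("provenance", "contextual"):
        adjusted[key] = base[key] + freed * base[key] / lean_total
    return adjusted
```

The weights still sum to 1 after adjustment, so the result can be used as a drop-in replacement for the base weights.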

Calibrated confidence, not false certainty

Every verdict ships a confidence band, not a single number. The band widens when the two forensic vendors disagree, when the media is degraded, or when only one vendor is available. Independent benchmarks (e.g. DeepFake-Eval-2024) put real-world detector accuracy on compressed social media at roughly 54–84% — so a bare “98% accurate” claim is a red flag. We’d rather be honest and useful than confident and wrong.
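
To make the band-widening rules concrete, here is a minimal sketch. All numeric values (the base half-width and each penalty) and the function name are assumptions chosen for illustration, not calibrated figures from the live system.

```python
from typing import Optional, Tuple


def confidence_band(score: float,
                    vendor_a: Optional[float],
                    vendor_b: Optional[float],
                    degraded: bool,
                    base_half_width: float = 0.05) -> Tuple[float, float]:
    """Return (low, high) around `score`, widened by the three factors named above."""
    half = base_half_width
    if vendor_a is None or vendor_b is None:
        half += 0.10  # assumed penalty: only one forensic vendor available
    else:
        half += 0.5 * abs(vendor_a - vendor_b)  # assumed: widen with vendor disagreement
    if degraded:
        half += 0.10  # assumed penalty: low-resolution or heavily compressed media
    return (max(0.0, score - half), min(1.0, score + half))
```

With two vendors in agreement on clean media the band stays at its base width; disagreement, degradation, or a missing vendor each make it wider.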

Human in the loop

High-stakes or genuinely uncertain cases — vendor disagreement, an inconclusive verdict, or a confidence band that straddles the real/AI boundary — are automatically queued for human review. Reviewed cases are clearly labelled.
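
The routing rule for the uncertain cases can be sketched as a simple predicate. The 0.5 boundary, the 0.3 disagreement threshold, and the "inconclusive" verdict string are hypothetical values; how a case is judged high-stakes is not specified here, so this sketch leaves that criterion out.

```python
from typing import Optional, Tuple


def needs_human_review(band: Tuple[float, float],
                       vendor_a: Optional[float],
                       vendor_b: Optional[float],
                       verdict: str,
                       boundary: float = 0.5,
                       disagreement_threshold: float = 0.3) -> bool:
    """True if any of the three uncertainty triggers listed above fires."""
    low, high = band
    straddles_boundary = low < boundary < high
    vendors_disagree = (
        vendor_a is not None
        and vendor_b is not None
        and abs(vendor_a - vendor_b) >= disagreement_threshold
    )
    return straddles_boundary or vendors_disagree or verdict == "inconclusive"
```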

What we won’t claim

We don’t claim 100% accuracy. Any tool claiming near-perfect accuracy on diverse real-world content is overstating its capabilities.

We don’t replace human judgement. Every verdict includes a “what would change this verdict” statement and an explicit not-definitive-proof disclaimer.

We don’t process private videos or store source video. We analyse publicly accessible URLs and keep only the forensic signals and verdict, not the video itself.

The public archive

Every public case goes into the UK Political Deepfake Archive — a timestamped, searchable record of AI-generated political media for journalists, researchers and regulators.