Introduction
Deepfakes are a massive problem without an effective solution.
Current approaches seem to fall into two categories:

- Watermarking: marking AI-generated content at the point of creation so that it can be identified later.
- Detection: training classifiers to spot deepfakes after the fact.

Both approaches have merits, but also some obvious drawbacks. Watermarks rely on deepfake purveyors acting in good faith and are easily removed. Deepfake detectors, on the other hand, are never going to be foolproof - in fact, I would argue that progress in GenAI corresponds directly to the ability to beat detectors.
We may yet end up in some kind of endgame where only the most well-resourced producers can make deepfakes that are undetectable by the best detectors, but I wouldn't bank on it.
I think there's a third solution to the whole mess - somewhat inspired by the WarGames quote: "the only winning move is not to play".
Rather than investing in detecting fake images, we should be investing in making it possible to detect authentic images.
The scheme I propose is relatively simple:
- Integrate Hardware Security Modules (HSMs) with camera sensors. The integrated HSM would digitally sign images captured from the camera sensor, using a per-device certificate issued under a PKI root certificate shared by all similar devices. Call these HSM-integrated cameras "Trusted Devices" (a sketch of the signing step follows this list).
- Develop and release a built-in browser feature which displays a mark of authenticity to the user when the image is authentic.
- Over time, restrict the distribution of HSM-integrated cameras to trusted and licensed organizations.
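To make the signing step concrete, here's a minimal sketch in Python using the `cryptography` package as a stand-in for real HSM firmware. The function name, payload layout, and key handling are my own illustration; in an actual Trusted Device the per-device private key would never leave the HSM.

```python
# Illustrative sketch only - not real HSM firmware.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec


def sign_capture(sensor_bytes: bytes,
                 device_key: ec.EllipticCurvePrivateKey,
                 device_cert_pem: bytes) -> dict:
    """Sign the raw sensor output with the device's per-device key."""
    signature = device_key.sign(sensor_bytes, ec.ECDSA(hashes.SHA256()))
    return {
        "image": sensor_bytes,           # the captured image data
        "signature": signature,          # ECDSA signature over the image bytes
        "device_cert": device_cert_pem,  # per-device certificate issued under the TDRC
    }


# Example usage with a locally generated key. A real device would use the key
# provisioned into its HSM at manufacture time, and a real certificate.
device_key = ec.generate_private_key(ec.SECP256R1())
capture = sign_capture(b"raw-sensor-bytes", device_key, b"<device certificate PEM>")
```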
The problem with implementing the scheme - as you might already see - isn't the technology. It's the cooperation required between manufacturers, browsers, and the PKI. Worse, the incentives to build it may not exist yet, and there are some failure modes that bear discussion. I'll spend some time outlining a gradualist's approach to making it happen. In any case, I think it's possible, and it will shortly become necessary anyway.
The Solution - In More Detail
There are three legs to the stool.
- Trusted Devices With Integrated Hardware Security Modules (HSMs)
- Public Key Infrastructure
- Browser Integration
I am defining Trusted Devices as digital cameras with a Hardware Security Module (HSM) directly integrated with the camera sensor. When an image is captured by the camera sensor, the HSM signs the image with the private key of a per-device certificate issued under a "Trusted Device Root Certificate" (TDRC).
In order to authenticate an image from a Trusted Device, the image's digital signature must be verified as having been produced by a non-revoked certificate deriving from the TDRC. This can be easily accomplished by leveraging Public Key Infrastructure and utilizing cryptographic standards already available in modern browsers.
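Here's a deliberately simplified sketch of that verification, in Python with the `cryptography` package standing in for the browser's own crypto stack. It assumes elliptic-curve keys and a single-level chain, and it omits the revocation, validity-period, and extension checks a real verifier would need.

```python
# Simplified verification sketch - not a complete X.509 path validator.
from cryptography import x509
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec


def verify_image(image_bytes: bytes, signature: bytes,
                 device_cert_pem: bytes, tdrc_cert_pem: bytes) -> bool:
    device_cert = x509.load_pem_x509_certificate(device_cert_pem)
    tdrc_cert = x509.load_pem_x509_certificate(tdrc_cert_pem)
    try:
        # 1. Was the device certificate issued under the TDRC?
        tdrc_cert.public_key().verify(
            device_cert.signature,
            device_cert.tbs_certificate_bytes,
            ec.ECDSA(device_cert.signature_hash_algorithm),
        )
        # 2. Was the image signed by the key bound to that certificate?
        device_cert.public_key().verify(
            signature, image_bytes, ec.ECDSA(hashes.SHA256())
        )
        return True
    except InvalidSignature:
        return False
```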
Verifying an image establishes a chain of trust from the browser to the camera sensor itself, provided the user is willing to make the following assumptions:
- The Trusted Device Root Certificate is not compromised.
- The Trusted Device's individual certificate is not compromised.
- The HSM of the Trusted Device has not been tampered with (HSM manufacturers already employ a number of tamper-resistant designs).
Browser Iconography
Here's a tricky problem: what's the best way to present an image's "indicator of authenticity" to the user in a way that is immediately understandable and also unspoofable?
Perhaps we could take inspiration from a successful deployment of a similar idea: the address bar lock icon, which indicates that a website is using HTTPS/TLS.
- It's unspoofable because you can't fake a lock icon with CSS if it's in the address bar.
- It's immediately understandable because most people automatically associate a lock with some notion of security.
However, there's a big problem. For images, you would need the indicator of authenticity to be closely associated with the image - ideally, on it or around it. So, it would be natural to, say, "overlay" the image with some kind of special icon like a lock. But any such overlay could be spoofed with markup and CSS. It's a bit of a conundrum.
I am tempted to propose some specific solutions, but I feel I would be out of my lane. Browsers employ talented UX people, and I don't want to poison the well with a bad suggestion. Suffice it to say that I think the problem is solvable, and would involve some combination of within-page indicators and out-of-page indicators. But perhaps there is another obvious solution which involves just one or neither? I would love to hear some of your ideas in the comments.
The Case For Limited Distribution
Should Trusted Devices be available for purchase by the consumer?
Should we integrate HSMs into the latest Android phones and iPhones so that everyone has a Trusted Device in their pocket? I'd like to argue that the answer is no.
The reason is that Trusted Devices have an obvious griefing vector: print out a deepfake and take a photo of it with a Trusted Device. The photo is authentic, but the content is not.
This is a major problem. If Trusted Devices are widely available, even a small number of cases would poison the well in the mind of the public. After all, how can the system as a whole be trusted when anybody can circumvent the whole thing with such an obvious workaround?
There's no way around it: Extending the chain of trust from the camera sensor to the depicted content itself requires trust in the operator. The trust in the operator has to come from somewhere and be justified by something.
To me, the only reasonable solution is to rely on the power of institutions. They've been around for thousands of years, and have a good track record of generally making people behave. So why not create another? This implies some kind of Trusted Device Licensing Board with real power.
The Trusted Device Licensing Board
Acting largely as a bureaucratic extension of the technical capabilities of the Public Key Infrastructure, the board would have the following capabilities:
- The granting of a certificate derived from the Trusted Device Root Certificate.
- The revocation of that certificate (and therefore the repudiation of any further images produced by the licensee's Trusted Device). A sketch of what revocation could look like in practice follows this list.
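To make the revocation capability concrete, here's a sketch of how it could map onto a standard X.509 certificate revocation list (CRL), with Python's `cryptography` package standing in for the board's actual issuance tooling. The function names and the seven-day update window are assumptions, and a production system might prefer OCSP or short-lived certificates instead.

```python
# Sketch of the board's revocation action as a standard X.509 CRL.
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec


def revoke_device(board_key: ec.EllipticCurvePrivateKey,
                  board_name: x509.Name,
                  device_cert: x509.Certificate) -> x509.CertificateRevocationList:
    now = datetime.datetime.now(datetime.timezone.utc)
    revoked_entry = (
        x509.RevokedCertificateBuilder()
        .serial_number(device_cert.serial_number)
        .revocation_date(now)
        .build()
    )
    return (
        x509.CertificateRevocationListBuilder()
        .issuer_name(board_name)
        .last_update(now)
        .next_update(now + datetime.timedelta(days=7))  # assumed publication window
        .add_revoked_certificate(revoked_entry)
        .sign(board_key, hashes.SHA256())
    )


def device_is_revoked(device_cert: x509.Certificate,
                      crl: x509.CertificateRevocationList) -> bool:
    # What a browser (or anyone else) would check before trusting an image.
    return crl.get_revoked_certificate_by_serial_number(device_cert.serial_number) is not None
```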
The board would derive its income from licensing fees and would operate as an independent entity. It would have its own investigative powers to resolve claims of misconduct (much as a bar association investigates and remedies claims of legal malfeasance).
As I explain a bit later, the Licensing Board is also the end game of mainstreaming and adoption, not something we can expect the public to trust and adopt immediately.
So Who Gets Licensed And Who Do They Work For?
Choosing who should get the devices is tricky (and fraught with danger). But I think it's easy to imagine a partial list of who ideally should be able to use them:
- Clerks of congresses, governments and courts recording official proceedings.
- Notaries, coroners, surveyors and other local officials.
- Trusted journalists documenting proof of disasters, atrocities and war-crimes.
- Licensed and accredited third-party agencies that provide photographic evidence for insurance claims and lawsuits as a service.
A follow-on question is whether these persons belong to the organizations they serve, or act as external, third-party service vendors.
In my opinion, it is better for Trusted Device Licensees to act as independent, third-party service vendors. The alternative comes with risks:
- An internal Trusted Device Owner may feel pressured to misuse the device in order to keep their job, especially when it is in the institution's interest to do so (this is the classic multiple-principals problem).
- The existence of an independent body of licensees would reinforce the notion of the 'Trusted Device Owner' as a distinct role and entity, and therefore give the trustworthiness of the system a concrete referent to attach to.
Plus, the economics just work out better with an independent board and third-party licensees. Persons licensed by the board would derive their livelihood from providing verifiable photographic evidence as a third-party service. Therefore, license holders would have a vested interest in acting honestly due to the threat of license revocation (which would come with a loss of income).
A Dose of Realism
To be transparent, I don't think it's reasonable to expect the creation of a licensing board ex nihilo. You have to sell people on the tech first, let the nature of the misuse problem reveal itself over time, and then sell people on more restricted distribution from a more selective root certificate.
Here's how it would go:
1. Integrate HSMs into devices purchasable by the consumer.
2. Wait for the inevitable abuse of the system.
3. Create a second "class" of devices with certificates deriving from a more selective root certificate.
4. Evolve the licensing board from as many iterations of steps 2 and 3 as are required.
Multi-Level Certificates for Scalable Enforcement Against Misuse
Supposing we do end up with a selective licensing board, there's also the problem of scale.
In order for the benefits of Trusted Devices to be felt at multiple levels of society (down to the local level, for example), there has to be some hierarchy of distribution and ownership, and therefore hierarchical accountability for the misuse of devices.
Therefore, it may be worthwhile to create sub-certificates under the TDRC corresponding to the hierarchy of organizations owning the Trusted Devices, paired with a policy that revokes the higher-level certificate if enough abuses occur at the lower level. This aligns the interests of all the individual actors in the Trusted Device ecosystem with the greater public good of having a trustworthy system.
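Here's a toy model of that escalation policy. The threshold and data structures are assumptions purely for illustration; a real policy would be set by the board and enforced through the PKI itself (by revoking the organization's intermediate certificate), not by an in-memory object like this one.

```python
# Toy model of the escalation policy - threshold and structure are assumed.
from collections import defaultdict


class EscalationPolicy:
    def __init__(self, org_threshold: int = 3):
        self.org_threshold = org_threshold
        self.revoked_devices = defaultdict(set)  # org id -> serials of revoked devices
        self.revoked_orgs = set()                # orgs whose intermediate cert is revoked

    def record_device_revocation(self, org_id: str, device_serial: int) -> None:
        """Revoke one device; escalate if its organization accumulates too many strikes."""
        self.revoked_devices[org_id].add(device_serial)
        if len(self.revoked_devices[org_id]) >= self.org_threshold:
            self.revoked_orgs.add(org_id)  # revoke the whole branch of the hierarchy

    def org_in_good_standing(self, org_id: str) -> bool:
        return org_id not in self.revoked_orgs
```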
Quis custodiet ipsos custodes?
So, this is putting a lot of trust in the Trusted Device Licensing Board, isn't it? After all, "Who watches the watchmen?".
Perhaps. But I don't think it's an unrealistic level of trust. If the board derives its income from licensing fees, it has a vested interest in preserving the credibility of the system. If people stop believing in the system, they stop hiring the licensees, fewer people renew their licenses, and the board loses income.
Old-fashioned corruption is also a possibility, as it is in any institution.
But if that is going to be our reason to say "no", what's our alternative here? I can easily envision a world in the not-too-distant future where:
- Insurance adjusters will routinely 'deepfake' photographic evidence to reduce payouts on insurance claims
- Photographic evidence will be inadmissible in court because any image could be a deepfake (and frequently is)
- Despotic leaders will commit war crimes with impunity because any photographic evidence can be hand-waved away as a deepfake
I don't want to present a false dichotomy between this system and a complete breakdown of the value of digital media, but I have a strong intuition that I'm not too far off the mark.
My sincere hope is that this post might serve as a blueprint for when we inevitably need to do something drastic about it anyway.