Why Is the Client-Side Scanning a Concern for Encryption?

	By Nathalia Sautchuk-Patrício
	October 21, 2021

Since today is Global Encryption Day, I decided to write my first post here on this topic.

About two months ago, Apple caused a controversy by announcing a measure to combat the spread of Child Sexual Abuse Material (CSAM). The controversy was so great that, a month after the announcement, Apple decided to postpone the new features in order to have more time to gather input from the various stakeholders and make improvements before releasing the measures originally announced.

The controversy revolves mainly around a mechanism known as client-side scanning. Briefly, client-side scanning checks whether the content of a message, be it text, images, videos or files, is similar to some “questionable” content before the message is sent to the intended recipient. If “questionable” material is found, the software may prevent the message from being sent and/or notify a third party of the attempt, even without the user’s knowledge.

The idea of client-side scanning has gained momentum because the main messaging platforms are increasingly adopting end-to-end (E2E) encryption, which makes it difficult for law enforcement authorities and national security agencies to access potentially illegal content. The mechanism has therefore been seen as an attractive way to deal with “questionable” content shared in E2E encrypted services without breaking the encryption.

For this mechanism to work, a database of fingerprints (also known as “hashes”) of known objectionable content is required. Once the database exists, client-side scanning can perform the fingerprint comparison either on the user’s device or on a remote server.

In the first case, the app on the user’s device holds a complete, up-to-date copy of the database. When the user is about to encrypt content and send it in a message, that content is converted to a fingerprint using the same technique that produced the fingerprints in the database. This fingerprint is then compared to the device’s database, and if a match is found, the message may not be sent and/or a designated third party may be notified.
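The on-device flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the fingerprinting technique (deployed systems generally rely on specialised image fingerprints rather than a plain cryptographic hash), and the names `OnDeviceScanner`, `send_message`, `encrypt_and_send` and `notify_third_party` are hypothetical, not taken from any real product.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stand-in fingerprint: SHA-256 of the raw bytes (an assumption)."""
    return hashlib.sha256(content).hexdigest()


class OnDeviceScanner:
    """Holds a local copy of the fingerprint database on the user's device."""

    def __init__(self, known_fingerprints):
        self.known = set(known_fingerprints)

    def matches(self, content: bytes) -> bool:
        return fingerprint(content) in self.known


def send_message(content, scanner, encrypt_and_send, notify_third_party):
    """Scan BEFORE encryption; block and report on a match, otherwise send."""
    if scanner.matches(content):
        notify_third_party(fingerprint(content))  # may happen without the user's knowledge
        return False                              # message is not sent
    encrypt_and_send(content)
    return True


# Hypothetical usage: one known item in the database, two outgoing messages.
sent, reported = [], []
scanner = OnDeviceScanner({fingerprint(b"known objectionable file")})
ok_1 = send_message(b"holiday photo", scanner, sent.append, reported.append)
ok_2 = send_message(b"known objectionable file", scanner, sent.append, reported.append)
```

The point of the sketch is the ordering: the check runs on the plaintext before encryption, which is why the encryption itself is left untouched while the content is nonetheless inspected.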

In the second case, where the comparison is performed on a remote server, the content is converted into a fingerprint just as in the first case when the user is about to encrypt it and send it in a message. In this scenario, however, the fingerprints of the user’s content are transmitted to a server, where they are compared with a central database.
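A minimal sketch of the server-side variant follows, again with SHA-256 as an assumed stand-in for the fingerprint and with hypothetical names (`MatchingServer`, `client_check`). It is meant to show one thing only: the server receives a fingerprint for every message checked, whether or not it matches.

```python
import hashlib


class MatchingServer:
    """Hypothetical central server holding the reference database."""

    def __init__(self, central_database):
        self.central_database = set(central_database)
        self.seen = []  # everything the server learns, matching or not

    def lookup(self, user_id: str, fp: str) -> bool:
        self.seen.append((user_id, fp))
        return fp in self.central_database


def client_check(user_id: str, content: bytes, server: MatchingServer) -> bool:
    """The client fingerprints the content locally and sends only the fingerprint."""
    fp = hashlib.sha256(content).hexdigest()
    return server.lookup(user_id, fp)


# Hypothetical usage: neither message matches, yet both fingerprints reach the server.
server = MatchingServer({hashlib.sha256(b"known objectionable file").hexdigest()})
match_a = client_check("alice", b"holiday photo", server)
match_b = client_check("alice", b"meeting notes", server)
```

Even in this toy version, `server.seen` ends up with one entry per message, which is the basis of the monitoring concern discussed later in the post.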

Returning to the Apple case, which brought the mechanism into the spotlight, the “questionable” content is child sexual abuse material. The idea is that, before an image is saved to iCloud, it is compared on the user’s device (whether an iPhone, iPad or MacBook) against known CSAM. If such material is found, the occurrence is reported to the National Center for Missing and Exploited Children (NCMEC).

Although client-side scanning seems at first to balance the fight against crime in the digital environment with user privacy, the mechanism in fact violates the E2E trust model: the content of messages is no longer private between sender and recipient, and legitimate messages may be prevented from reaching their intended destinations.

The first point to raise is the possibility of “false positives” and “false negatives” in the comparison with the fingerprint database. Since fingerprints are created from the content itself, a simple change to it (such as changing the size or color of an image or distorting the audio in a video) can make the fingerprint completely different, so that the material can no longer be identified. As a result, the mechanism can fail to identify a large share of “questionable” material, making the measure ineffective in combating the dissemination of this type of content.
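The false-negative problem is easy to reproduce with an exact-match fingerprint. The sketch below assumes SHA-256 as the fingerprint and uses made-up bytes in place of a real image. Fingerprints designed specifically for images aim to tolerate some alterations, so this shows the failure mode described above in its starkest form rather than the behaviour of any particular deployed system.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Exact-match fingerprint: SHA-256 of the raw bytes (an assumption)."""
    return hashlib.sha256(content).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Placeholder content; a real case would be the bytes of an image or video.
original = b"example image bytes" * 100
# Flip a single bit in the last byte, standing in for a tiny edit.
altered = original[:-1] + bytes([original[-1] ^ 0x01])

database = {fingerprint(original)}

original_match = fingerprint(original) in database
altered_match = fingerprint(altered) in database
changed = differing_hex_digits(fingerprint(original), fingerprint(altered))
```

Here `original_match` is true while `altered_match` is false, even though the two inputs differ by a single bit, and `changed` counts how many of the 64 hexadecimal digits of the fingerprint were affected by that one-bit edit.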

When the fingerprint comparison is performed on a remote server, the service provider is able to monitor and filter the content that a user wants to send. The same is true when the comparison takes place on the user’s device, if third parties are notified of any objectionable content found. Once users suspect that the content they want to send is being scanned, they may self-censor or switch to another service that does not scan content.

Another point is that adding client-side scanning increases the “attack surface,” creating vulnerabilities that criminals can exploit. Attackers can manipulate the database of objectionable content, for example by adding fingerprints to it and receiving notifications when matches with those fingerprints occur. This would give them a way to track who communicated certain content, and when and where they did so. By taking advantage of a system’s blocking capabilities, criminals could even block users from sending specific content. This could be aimed at legitimate uses, potentially hindering communications among law enforcement, emergency response and national security personnel.

One more technical challenge for this mechanism is maintaining an up-to-date version of the complete reference database on each device, if comparisons are made on the user’s device. This involves potential restrictions on the process for adding fingerprints to or removing them from the database, the network bandwidth needed to transmit updated versions of the database, and the storage and processing capacity the devices need to perform the comparison in real time. On the other hand, if the comparisons are made on a central server, the fingerprint of the content that the user is trying to send will be available to those who control that server, even before the content qualifies as “questionable” in the eyes of law enforcement authorities. This raises issues for users’ security and privacy, potentially exposing details of their activities to anyone with access to the server.

Finally, once implemented, the client-side scanning mechanism can be used for purposes other than identifying child sexual abuse material, the justification given by Apple. For example, it can be used to collect information for advertising, to prevent legitimate content from being shared, or even to block communication between users (such as political opponents). In such cases, whoever controls the database can track any content of interest, creating risks to users’ security and privacy.

In general, client-side scanning ends up being a solution of very limited effectiveness against the dissemination of “questionable” content such as CSAM, while bringing several risks to the security and privacy of users, including breaking the trust model of E2E encryption.
