Mostakim Mullick, Security Engineer & Ivan Gudymenko, IT Security Architect & Subject Matter Lead Confidential Computing at Telekom MMS

Messaging apps have become an integral part of our everyday life. Sharing information, getting news and communicating from anywhere have never been easier, and the technological advance in this domain continues. Despite the obvious advantages, the new communication paradigm comes at a price. For example, to ensure a seamless communication experience and transparent contact discovery, end users have to grant service providers access to their personal phonebook. That raises privacy concerns and paves the way for various privacy invasion scenarios.

Privacy Issue

Privacy concerns of messaging apps

To mitigate this issue and to protect the confidentiality of communications, many messaging service providers use end-to-end encryption. WhatsApp, for example, enables it by default for chats and offers it as an option for chat backups. Applications like Telegram employ MTProto, a proprietary protocol, to securely access the server API. Securing the communication channel against eavesdroppers is an important step towards a decent security posture for any web-based application, not only messaging apps. However, it does not solve the problem of revealing personal contacts to the service provider, which can use them to map the relationships between users (social graphs).

After downloading and installing a messaging app like WhatsApp or Signal, users can immediately connect to those contacts in their address book who use the same application. To make this possible, they allow the app to access their address book and upload it to the provider’s servers. Depending on the service provider and the user’s privacy settings, information exposed during contact discovery may be exploited by attackers or misused by the service provider, either maliciously or unintentionally, e.g. in case of a data breach.

WhatsApp settings

What could be a possible solution?

One way for the service provider to process the client’s address book while respecting the client’s privacy is so-called Private Contact Discovery. The naïve way to do this is to hash each contact in the client’s phonebook with a hashing algorithm like SHA-256 and send only the hashes. If the server holds the SHA-256 hash of every registered user’s number, all it has to do is check which of the hashes sent by the client match. However, phone numbers have low entropy: the set of valid numbers is small enough to enumerate, so an attacker can hash every candidate number and compare the results. Research has shown that, with such techniques, attackers can quickly recover the phone numbers behind the hashes.
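To make the weakness of the naïve approach concrete, here is a minimal Python sketch. The function names, the phone numbers and the deliberately tiny search space (a fixed prefix plus four unknown digits) are assumptions chosen for illustration. A real attacker would enumerate a much larger range, but the principle is the same.

```python
from __future__ import annotations

import hashlib


def hash_number(number: str) -> str:
    """Return the hex SHA-256 digest of a phone number string."""
    return hashlib.sha256(number.encode("utf-8")).hexdigest()


def discover(client_hashes: set[str], registered_hashes: set[str]) -> set[str]:
    """Server side: keep the client hashes that belong to registered users."""
    return client_hashes & registered_hashes


def brute_force(target_hash: str, prefix: str, digits: int) -> str | None:
    """Attacker: hash every number with the given prefix until one matches."""
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if hash_number(candidate) == target_hash:
            return candidate
    return None


# Hypothetical data: two registered users and a client phonebook.
registered = {hash_number(n) for n in ("+4915100001234", "+4915100005678")}
phonebook = ["+4915100001234", "+4915100009999"]

# Naive contact discovery: the client uploads hashes, the server intersects.
matches = discover({hash_number(n) for n in phonebook}, registered)
found = [n for n in phonebook if hash_number(n) in matches]

# Anyone who sees a hash can reverse it by enumerating the number space.
leaked = hash_number("+4915100005678")
recovered = brute_force(leaked, prefix="+491510000", digits=4)
```

In this sketch the discovery step works as intended, yet the hash offers no real protection: the brute-force loop needs at most 10,000 hash computations to recover the number.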

Contact Discovery Service
This picture shows how normal contact discovery works.

A more sophisticated way is to employ privacy-enhanced search algorithms that build on encrypted Bloom filters. With them, it is feasible to create a Symmetric PIR (Private Information Retrieval) system in which the service provider does not learn the client’s contacts and the client does not learn the list of all registered users. Unfortunately, this approach does not scale to a large user base, since the Bloom filters themselves become too large to be sent to mobile clients. One possible solution is “Sharded Bloom Filters”: rather than putting all registered users into a single Bloom filter, one shards the users into buckets, each with its own Bloom filter. However, this results in a trade-off between network overhead and privacy, because the shards a client requests reveal something about its contacts, and the smaller the shards, the more they reveal.
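The sharding idea can be sketched in a few lines of Python. Note that this is a plain, unencrypted Bloom filter, not the encrypted variant needed for Symmetric PIR. The class, the shard assignment by hash and all parameters (four shards, 1024 bits, three hash functions) are assumptions made for illustration only.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter backed by a byte array."""

    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # May report false positives, never false negatives.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def shard_id(number: str, num_shards: int) -> int:
    """Assign a number to a shard based on the first byte of its hash."""
    return hashlib.sha256(number.encode("utf-8")).digest()[0] % num_shards


NUM_SHARDS = 4
shards = [BloomFilter(size_bits=1024, num_hashes=3) for _ in range(NUM_SHARDS)]

# Hypothetical registered users, each added to the filter of its shard.
registered_users = ["+4915100001234", "+4915100005678", "+4915100004321"]
for number in registered_users:
    shards[shard_id(number, NUM_SHARDS)].add(number)


def is_registered(number: str) -> bool:
    """Client side: query only the shard this number belongs to."""
    return number in shards[shard_id(number, NUM_SHARDS)]
```

A client only needs to download the shards that its contacts fall into, which reduces the transfer size. In exchange, the server sees which shards were requested, and that is where the privacy loss comes from.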

Signal developed private contact discovery in which the whole contact discovery process takes place inside a trusted execution environment, or secure enclave in the case of Intel Software Guard Extensions (SGX). More specifically, contact discovery runs inside a hardware-encrypted region of RAM that is isolated from the host operating system and kernel.

At a high level, private contact discovery using SGX can be described in a few steps:

  • Run a contact discovery service in a secure SGX enclave.
  • Clients wishing to do contact discovery negotiate a secure network connection all the way to the enclave.
  • Clients use remote attestation to check that the code executing in the enclave matches the published open-source code.
  • Clients provide the encrypted address book IDs to the enclave.
  • The enclave searches the set of all registered users for a client’s contacts and returns encrypted results to the client.

Source: Signal SGX Blog
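The order of the steps above, in particular that the client verifies the enclave before sending any contacts, can be illustrated with a toy simulation in Python. Everything here is a stand-in: the class offers no actual isolation, the "measurement" is just a SHA-256 digest of a byte string, and the encrypted channel is omitted. In real SGX, the measurement is part of a signed quote that the client has to verify. All names are hypothetical.

```python
from __future__ import annotations

import hashlib
import hmac


def measure(code: bytes) -> str:
    """Stand-in for an enclave measurement: a SHA-256 digest of the code."""
    return hashlib.sha256(code).hexdigest()


class SimulatedEnclave:
    """Toy model of the enclave. It provides no real isolation."""

    def __init__(self, code: bytes, registered: set[str]) -> None:
        self._measurement = measure(code)
        self._registered = registered

    def attest(self) -> str:
        # Real SGX returns a signed quote; this returns the bare measurement.
        return self._measurement

    def lookup(self, contacts: list[str]) -> list[str]:
        # Real deployments receive and return encrypted data.
        return [c for c in contacts if c in self._registered]


def discover_contacts(
    enclave: SimulatedEnclave, expected_measurement: str, contacts: list[str]
) -> list[str]:
    """Client side: check the attestation first, then send the contacts."""
    if not hmac.compare_digest(enclave.attest(), expected_measurement):
        raise RuntimeError("attestation failed: unexpected enclave code")
    return enclave.lookup(contacts)


# The client derives the expected measurement from the published source code.
published_code = b"contact-discovery-service v1"
expected = measure(published_code)

honest = SimulatedEnclave(published_code, {"+4915100001234"})
result = discover_contacts(
    honest, expected, ["+4915100001234", "+4915100009999"]
)
```

If the enclave runs anything other than the published code, the measurement differs and the client aborts before any contact leaves the device.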

SGX Contact Discovery Service
This picture shows SGX contact discovery and how it works.

A server-side SGX enclave enables a service to perform computations on encrypted client data without learning the data’s content or the outcome of the computation. Because the enclave remotely attests to the software it is running, and because the remote server and its OS have no insight into the enclave, the service learns nothing about the content of the client request. It is almost as if the client were running the query locally on its own device.

If you want to try out a demo of secure contact discovery, here is one possible setup: the server can be written in Golang, mostly for speed and deployment simplicity. It exposes an HTTP API to the outside world and uses SQLite to store and query the list of hashes. EGo, an SDK for running Go applications inside SGX enclaves, can then be used to make the application confidential.
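The demo server described above is written in Go. To illustrate only its storage step, here is a small Python sketch of storing and querying hashes with SQLite. The table name, column name and helper functions are assumptions, not taken from the demo, and the HTTP API and the enclave are left out.

```python
import hashlib
import sqlite3


def hash_number(number: str) -> str:
    """Return the hex SHA-256 digest of a phone number string."""
    return hashlib.sha256(number.encode("utf-8")).hexdigest()


def create_store(numbers):
    """Create an in-memory database holding the hashes of registered users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE registered (hash TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO registered (hash) VALUES (?)",
        [(hash_number(n),) for n in numbers],
    )
    return conn


def lookup(conn, hashes):
    """Return the subset of the given hashes that are registered, sorted."""
    if not hashes:
        return []
    placeholders = ",".join("?" for _ in hashes)
    rows = conn.execute(
        f"SELECT hash FROM registered WHERE hash IN ({placeholders})",
        list(hashes),
    ).fetchall()
    return sorted(row[0] for row in rows)


# Hypothetical data: two registered users, one known and one unknown contact.
conn = create_store(["+4915100001234", "+4915100005678"])
known = hash_number("+4915100001234")
unknown = hash_number("+4915100009999")
matches = lookup(conn, [known, unknown])
```

Running such a lookup inside an enclave is what keeps the queried hashes hidden from the host. Outside an enclave, this is just the naïve hash matching discussed earlier.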

Summing up, ensuring the privacy of personal contact information even against the service provider of a messaging app is possible, even though it incurs additional overhead. We would argue, however, that for the sake of data minimization and to ensure long-term, heterogeneous privacy protection, the approaches presented and discussed in this blog post deserve to be further investigated and developed.

Read more:

> Learn more about Confidential Computing (German blog post)

> Panel discussion about Confidential Computing

> Panel discussion about Secure Encryption Proxies