Understanding the AAPL/GOOG Contact Tracing Initiative

Kevin Leffew is a Global Partner Manager and Solution Architect at Storj Labs, a company building the world’s first secure, performant, and economical decentralized cloud storage network.

On Friday, April 10th, 2020, in the midst of the global SARS-CoV-2 pandemic, Apple and Google announced a partnership intended to give public health officials greater insight into the spread of the coronavirus, using opt-in Bluetooth beacon technology to measure person-to-person contact through cell phones.

This article will:

  1. Explain the architecture and cryptography behind the solution in layman’s terms, touching on possible privacy risks
  2. Compare the overall model to the microapp adopted by users across China in late January
  3. Explore the possible commercial motivations for this cross-platform tracing solution

How It Works

Key Schedule for Contact Tracing

Tracing Key

Each person’s cell phone generates a unique cryptographic root key, called the ‘Tracing Key’. This key is said to be stored securely on the device.

The specification does not explicitly mention this, but for the Tracing Key to be stored securely on the device, a trusted enclave should be used. On iOS it will likely use the Secure Enclave introduced with the Apple A7 chip, which first appeared in the iPhone 5s.

A similar concept has been introduced into Samsung Android builds, called Samsung Knox — which today is used for storing private keys for cryptographic blockchain wallets (like Ethereum), among other use cases for enterprise and governments.

Samsung KNOX Device — Hardware Stack

It is important to note that if an application stores the key on an encrypted keyring (which is how most keys and passwords are stored), this creates an unacceptable security hole. With keyrings, a plaintext version of the cryptographic key must briefly be held in system memory, which should not be accepted by citizens or governments due to the privacy risks.

Daily Tracing Key

When a user (or a third-party health care agency) indicates a positive test for coronavirus, the app gathers the ‘Daily Tracing Keys’, each derived from the root Tracing Key, for the days on which the user could have exposed others.

A Daily Tracing Key is generated once per day, keyed to the day number (Unix epoch time divided by the number of seconds in a day), using this method:

The algorithm for generating Daily Tracing Keys

The Daily Tracing Key is derived using HKDF, a key derivation function built on a hash-based message authentication code (HMAC), a topic which I have heavily researched in the past. Daily Tracing Keys are held client-side and never leave the device unless the user “tests positive”, in which case the keys for the relevant days become Diagnosis Keys and are sent, along with their DayNumber, to a central server called the “Diagnosis Server”.
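The derivation can be sketched with nothing more than Python's standard library. This is a minimal sketch based on my reading of the April 2020 draft, which builds the key with HKDF (an HMAC-based construction, RFC 5869) over SHA-256. The `CT-DTK` info string, the little-endian day encoding, and the function names are assumptions to check against the specification, not a reference implementation.

```python
import hashlib
import hmac
import os
import struct

SECONDS_PER_DAY = 60 * 60 * 24


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: an extract step followed by an expand step."""
    if not salt:
        # RFC 5869: a missing salt is treated as a string of HashLen zero bytes.
        salt = b"\x00" * hashlib.sha256().digest_size
    pseudo_random_key = hmac.new(salt, key_material, hashlib.sha256).digest()

    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(
            pseudo_random_key, block + info + bytes([counter]), hashlib.sha256
        ).digest()
        output += block
        counter += 1
    return output[:length]


def day_number(unix_seconds: int) -> int:
    """Number of whole days since the Unix epoch."""
    return unix_seconds // SECONDS_PER_DAY


def daily_tracing_key(tracing_key: bytes, day: int) -> bytes:
    """Derive a 16-byte Daily Tracing Key from the root Tracing Key.

    Assumption: info = "CT-DTK" followed by the day number as a
    32-bit little-endian unsigned integer.
    """
    info = b"CT-DTK" + struct.pack("<I", day)
    return hkdf_sha256(tracing_key, b"", info, 16)


if __name__ == "__main__":
    tracing_key = os.urandom(32)  # stands in for the key held in secure storage
    today = day_number(1586476800)  # 10 April 2020, 00:00 UTC
    print(today, daily_tracing_key(tracing_key, today).hex())
```

Because the derivation is one-way, publishing a Daily Tracing Key reveals nothing about the root Tracing Key or about the keys for other days.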

Rolling Proximity Identifier: Where things get interesting!

The Rolling Proximity Identifier (RPI) is used to record each time you come in contact with another human, sent via Bluetooth LE from cell phone to cell phone.

The Contact Tracing Bluetooth Specification does not use location data; it relies only on Bluetooth beaconing to detect proximity.

Similar to how a bitcoin address is derived from a public key, an RPI is derived from the Daily Tracing Key: it is a keyed hash (HMAC) of the current time interval number under the Daily Tracing Key, truncated to 16 bytes, and it changes whenever the Bluetooth LE MAC address changes.

Thus, a user’s Rolling Proximity Identifiers cannot be correlated without the Daily Tracing Key (which is only shared with servers when a positive test occurs). The result is a pseudonymous identifier that is broadcast to, and recorded by, everyone the user crosses paths with over a 24-hour period.
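To make the unlinkability argument concrete, here is a minimal sketch of how a rolling identifier could be derived, assuming the draft specification's construction: HMAC-SHA256 keyed with the Daily Tracing Key, over the string `CT-RPI` plus a one-byte time interval number, truncated to 16 bytes. The ten-minute interval length, the byte encoding, and the function names are my assumptions and should be checked against the specification.

```python
import hashlib
import hmac

SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_INTERVAL = 60 * 10  # assumed ten-minute windows, 144 per day


def time_interval_number(unix_seconds: int) -> int:
    """Index (0..143) of the ten-minute window within the current day."""
    return (unix_seconds % SECONDS_PER_DAY) // SECONDS_PER_INTERVAL


def rolling_proximity_identifier(daily_key: bytes, interval: int) -> bytes:
    """Derive the 16-byte identifier broadcast during one time interval."""
    if not 0 <= interval <= 143:
        raise ValueError("interval must be between 0 and 143")
    # Keyed hash: without daily_key, outputs for different intervals look unrelated.
    mac = hmac.new(daily_key, b"CT-RPI" + bytes([interval]), hashlib.sha256).digest()
    return mac[:16]  # truncate the 32-byte HMAC to 16 bytes


if __name__ == "__main__":
    daily_key = bytes(range(16))  # placeholder key for illustration only
    for interval in (0, 1, 2):
        print(interval, rolling_proximity_identifier(daily_key, interval).hex())
```

An observer who records these identifiers sees values that appear unrelated to one another; only someone holding the Daily Tracing Key can regenerate the sequence and link them.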

The RPI is 16 bytes (or 128 bits), which the authors believe is large enough to avoid collisions. This seems like a safe bet: there are 2^128 (roughly 3.4 E+38) possible identifiers, so a collision is only *guaranteed* once more than 2^128 identifiers have been generated, and by the birthday bound even a 50% chance of any collision requires on the order of 2^64 (roughly 1.8 E+19) identifiers.
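To put the size of a 128-bit identifier space in numbers, the sketch below uses the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / 2N), to estimate the chance that any two of n identifiers collide. It assumes identifiers are uniformly distributed, which is an idealization of a truncated HMAC; the function name is mine.

```python
import math

IDENTIFIER_BITS = 128
IDENTIFIER_SPACE = 2 ** IDENTIFIER_BITS  # number of possible 16-byte identifiers


def collision_probability(num_identifiers: int, space: int = IDENTIFIER_SPACE) -> float:
    """Birthday approximation for at least one collision among uniform identifiers."""
    if num_identifiers < 2:
        return 0.0
    exponent = num_identifiers * (num_identifiers - 1) / (2 * space)
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12, 2**64):
        print(f"{n:.3e} identifiers -> collision probability ~ {collision_probability(n):.3e}")
```

Even a trillion identifiers leaves the probability of a single collision vanishingly small, which is why 16 bytes is a comfortable size for this purpose.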

The Model’s Biggest Security Risk: the Diagnosis Server

The Diagnosis Server is a central server (hosted by a trusted 3rd party, presumably Apple, or a public health agency) that aggregates the Diagnosis Keys from the users who test positive and then distributes them to all the user clients who are using contact tracing.

In order to identify any exposures, each client frequently fetches the list of Diagnosis Keys. Since Diagnosis Keys are sets of Daily Tracing Keys with their associated Day Numbers, each client is able to re-derive the sequence of Rolling Proximity Identifiers that were advertised over Bluetooth by the users who tested positive.

In order to do so, they use each of the Diagnosis Keys with the function defined to derive the Rolling Proximity Identifier. For each of the derived identifiers, they match it against the sequence they have found through Bluetooth scanning.
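The matching step described above can be sketched as follows. This is a simplified illustration, not the actual client logic: it assumes 144 ten-minute intervals per day, the `CT-RPI` derivation from the draft specification, and an in-memory set of identifiers heard over Bluetooth. A real client would also record when each identifier was heard and only compare identifiers from the matching day. All names here are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable, List, Set, Tuple

INTERVALS_PER_DAY = 144  # assumed ten-minute windows


def rolling_proximity_identifier(daily_key: bytes, interval: int) -> bytes:
    """Re-derive the 16-byte identifier for one interval (assumed construction)."""
    mac = hmac.new(daily_key, b"CT-RPI" + bytes([interval]), hashlib.sha256).digest()
    return mac[:16]


def find_exposures(
    diagnosis_keys: Iterable[Tuple[bytes, int]],
    observed_identifiers: Set[bytes],
) -> List[Tuple[int, int]]:
    """Return (day_number, interval) pairs where a published key matches a sighting.

    diagnosis_keys: (daily_tracing_key, day_number) pairs fetched from the server.
    observed_identifiers: identifiers this device heard over Bluetooth.
    """
    matches = []
    for daily_key, day in diagnosis_keys:
        for interval in range(INTERVALS_PER_DAY):
            candidate = rolling_proximity_identifier(daily_key, interval)
            if candidate in observed_identifiers:
                matches.append((day, interval))
    return matches


if __name__ == "__main__":
    infected_key = bytes(range(16))  # placeholder Diagnosis Key
    heard = {rolling_proximity_identifier(infected_key, 42)}
    print(find_exposures([(infected_key, 18362)], heard))
```

Note that the comparison happens entirely on the device: the server only distributes keys and never sees which identifiers a client has observed.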

Below is a flow diagram for the string of events:

A server operator implementing the Diagnosis Server does not learn which users have been in proximity with one another, or users’ locations, unless one of the following occurs:

Privacy Risks of the Model

  1. The server retains metadata from the clients uploading the Diagnosis Keys (e.g. IP address or unique device ID), in which case users can easily be deanonymized (making it imperative that the server is built using open-source code and runs from a verified image)
  2. Cryptographic keys are somehow exposed in plaintext in shared memory
  3. Devices behave in a byzantine manner and reveal their ‘Matches’, which are then retrieved by the Diagnosis Server, or third-party applications ask users to reveal their matches

Comparison to the AliPay Model

Although the societal outcome may be similar, the approach taken by Apple and Google is somewhat different than the Health Code System launched by AliPay.

This model issues codes via AliPay and on a microapp on the WeChat messaging service.

After users fill in a form with personal information including name, national ID number, contact information, and details of recent travel, the software generates a QR code in one of three colors. Green enables its holder to move about unrestricted. Those with yellow codes may be asked to stay home for seven days. Red means a two-week quarantine.

When the user grants the app software access to personal data, an API is called that sends the person’s location, city name and an identifying code number to a server. The user (and the rest of the physical world) only sees Red, Yellow, or Green.

We can imagine how a similar model might be built using a third-party information aggregator that obtains Diagnosis Keys and UUIDs and ties them to Rolling Proximity Identifiers (possibly by distributing rogue Bluetooth beacons across a city to target a specific subset of the population).

Commercial Motivations for Beacon-based Advertising

Through studying untrusted distributed networks, I have learned to never assume altruism — and instead model a worldview based off of rational economic assumptions.

If this is the case, we may assume that there is likely some underlying commercial motivation for Apple and Google to implement a cross-platform API for contact tracing, and that this motivation probably involves beacon-based advertising for second- and third-order consumer contact.

Have any thoughts around what this might look like? I’d love to hear them.
