This post was written by Brechy. Thanks Nam Ngo for the feedback and review!
Secure multi-party computation (MPC) enables a group of participants to collaborate on a specific task that requires their data as input, ensuring the privacy of their inputs and the correctness of the output.
This allows performing operations on private information without disclosing it or involving a trusted third party. The only data each party receives is the function's result.
There are several MPC protocols. This post provides an overview of the most general concepts and tools shared by many of them.
MPC enables multiple parties to collaborate on a specific function without revealing their private data to one another. This ensures that no single party can access the data of others. The participants agree on a particular task or function to perform, and then use an MPC protocol to collectively determine the result.
We can think of a sample use case of managing the private keys of an Ethereum account:
A set of participants is each given a segment of the secret key.
Using an MPC protocol, they can input their segments and run the protocol to execute the signature function.
No single participant can sign a transaction unless all or a sufficient number of participants input their secret key segments, and no participant has enough information to reconstruct the secret key.
MPC protocols can be categorized based on the functions they are designed to run, falling into two main categories: generic and specialized.
Specialized protocols are designed and optimized for a specific functionality. These protocols are built around a specific task, like Private Set Intersection (PSI) or voting. Specialized protocols leveraging the specific structure of a function can offer significant performance improvements.
Generic protocols can compute any function that can be represented as a fixed-size circuit. Yaoβs Garbled Circuits protocol is an example of a generic protocol. They can be applied to a wide range of problems.
We can use the following properties to help us define an ideal secure protocol:
Privacy: No party should learn anything more than the function output.
Correctness: Each party is guaranteed to receive the correct output.
Independence of Inputs: Every party can decide its input independently of other parties.
Guaranteed Output Delivery: No party can prevent other parties from receiving the function output.
Fairness: If one party receives the function output, every party will receive the output.
These guarantee the correctness of the output and ensure that no party can disrupt the process or gain an unfair advantage. However, additional measures are needed to ensure input integrity and protect the output from giving away information.
Participants can input any value, potentially manipulating the process. For instance, in an auction, a participant could falsely input an extremely high bid to ensure they win, even though their actual bid is much lower. To mitigate this, mechanisms like requiring signed inputs verified can be used, though this can increase computational costs.
The process result could reveal information about the inputs or the participants. Using the auction example, if the final highest bid is revealed, other participants can infer details about the highest bidder's strategy or budget.
Let's explore some real world use cases.
It's possible to enhance privacy during the machine learning training and inference phases. During training, multiple parties can collaboratively train a model without disclosing their individual datasets. For inference, it can ensure that both the client's input data and the server's model remain confidential. This allows clients to receive model outputs without exposing their data and ensures that the providerβs model remains private.
Companies can enhance key protection by distributing key shares across multiple secure environments. This ensures that no single location holds the entire private key, reducing the risk of key compromise. An adversary would need to breach all environments to access the complete key. This protects cryptographic keys, secures authentication processes and enforces signature approval policies.
Multiple parties can combine and analyze datasets without disclosing private information. Organizations can securely integrate various records to study trends while adhering to privacy regulations. This application enables data analysis without compromising confidentiality.
Many protocols use a circuit to represent the function being computed. The circuit's structure and operations remain constant and are not influenced by user inputs. As a result, the runtime of the protocol does not disclose any information about the inputs.
A simple circuit:
These circuits can be either boolean circuits that process binary variables using logic gates, or arithmetic circuits that perform operations on numerical values.
Boolean circuits need to redefine basic operations for every bit width: supporting arithmetic on n-bit integers in such a protocol requires implementing n-bit addition and multiplication circuits.
Arithmetic circuits typically operate over a finite field, where the size of the field is set in advance. Although arithmetic circuits are primarily designed for arithmetic operations, non-arithmetic operations such as comparisons and equality checks can also be implemented.
Expressing the target computation as a circuit can be challenging since not every function can be easily converted into a circuit format, but compilers can be used for this. However, every function must be deterministic and free of indefinite loops.
A compiler converts a program written in a specialized, high-level language to an intermediate representation (often a circuit). The circuit is then passed as input to a runtime, which executes an MPC protocol and produces an output.
Let's consider an example where our function performs matrix element-wise multiplication, and our input and output are 2x2 matrices. We can use Circom and the circom-2-arithc compiler to create our circuit.
template matrixElementMul (m,n) {
signal input a[m][n];
signal input b[m][n];
signal output out[m][n];
for (var i=0; i < m; i++) {
for (var j=0; j < n; j++) {
out[i][j] <== a[i][j] * b[i][j];
}
}
}
component main = matrixElementMul(2,2);
The compiled circuit will consist of four arithmetic multiplication gates:
[{"op":"AMul","lh_in":0,"rh_in":4,"out":8},
{"op":"AMul","lh_in":1,"rh_in":5,"out":9},
{"op":"AMul","lh_in":2,"rh_in":6,"out":11},
{"op":"AMul","lh_in":3,"rh_in":7,"out":10}]
For this example, it might have been quicker to manually construct the gates. However, we now have a function that can serve as a building block for actual matrix multiplication or more complex operations.
Oblivious transfer (OT) is a cryptographic two-party protocol. It allows the receiving party to obliviously select one of the sending partyβs inputs . The protocolβs privacy guarantees ensure that the sender does not learn the choice of the receiver and the receiver does not learn the non selected inputs.
Let's review a basic example, the 1-out-of-2 oblivious transfer. In this protocol, the sender has two messages, π 0 and π 1 . The receiver wants to learn one of these messages, π π , without the sender knowing which message was chosen.
1-out-of-2 Oblivious Transfer protocol
Initialization:
The sender has two messages: π 0 and π 1 .
The receiver wants to choose one of these messages, indexed by π , where π β { 0 , 1 } .
Communication Phase:
The receiver generates a pair of public and secret keys ( π π , π π ) .
The receiver sends π π to the sender.
The sender encrypts π 0 and π 1 using π π in such a way that only the selected message π π can be decrypted by the receiver using the secret key π π .
Transfer Phase:
This way we ensure that data privacy is maintained:
The receiver learns only the chosen message π π and nothing about the other message π 1 β π
The sender does not learn which message the receiver chose.
Garbled circuits (GCs) were introduced by Andrew Yao in the 1980s as a technique for secure two-party computation (2PC). The GC protocol involves two parties, the garbler and the evaluator, who work together to securely evaluate a function represented as a Boolean circuit. The function consists of AND and XOR gates, and each party contributes part of the input.
Garbled Circuit protocol
Here's a step-by-step overview of how the GC protocol works for a simple circuit.
Our circuit will be constructed with only one AND gate. The truth table shows the output for all possible input combinations:
Garbling is a process by which the truth table is obfuscated. The garbler picks four random strings, or labels:
The garbler then uses every pair of labels corresponding to a possible scenario to encrypt the output corresponding to that scenario.
The two relevant labels are put through a key derivation function π» to derive a symmetric encryption key, and that key is used to encrypt π β§ π . Then the garbled gate consists of the four resulting ciphertexts, in a random order.
Once the evaluator receives the garbled gate, it needs to decrypt exactly one ciphertext: the one corresponding to the real values π and π , encrypted with $H(W_{a}^A, W_{b}^B)$.
In order to do this, it needs to receive from the garbler $W_{a}^A$ and $W_{b}^B$.
Since the garbler knows π , he can send the evaluator $W_{a}^A$. The labels are all random, independent, and identically distributed, so the evaluator wonβt learn anything about π from $W_{a}^A$
However, getting $W_{b}^B$ to the evaluator is harder. The garbler canβt send both $W_{0}^B$ and $W_{1}^B$ to the evaluator because that will allow them to decrypt two ciphertexts in the garbled gate. Similarly, the evaluator canβt simply ask for the one they want because they donβt want the garbler to learn π .
So, the garbler and the evaluator use Oblivious Transfer, which allows the evaluator to learn only $W_{b}^B$ without revealing π to the garbler.
Note that in order for this to work, the evaluator needs to know when decryption succeeds and when it doesnβt. Otherwise, thereβs no way for them to know which ciphertext yields the correct answer.
Let's create an example walkthrough for the case where the garbler's input is $a = 0$ and the evaluator's input is $b = 1$.
Initialization:
Garbler generates labels for inputs $a$ and $b$.
Garbler creates and transfer the garbled circuit.
Input Label Distribution:
Oblivious Transfer:
Evaluation:
Evaluator uses the keys $W_{0}^A$ and $W_{1}^B$ to decrypt the corresponding entry in the garbled table:
$Enc(H(W_{0}^A, W_{1}^B), W_{0}^{out}) \rightarrow W_{0}^{out}$
Output Reconstruction:
Secret sharing is an approach for distributing a secret value using shares that separately do not reveal any information about the secret. The secret value can only be reconstructed if all or a sufficient number of shares are combined.
Let's review how Additive Secret Sharing works, in an example involving 3 participants and an addition operation. In this scheme, the secret is divided into $m$ parts, and the secret can only be reconstructed when all parts are combined.
Choose a secret value.
Choose $m-1$ random numbers as shares.
$m = 3$
$S_1 = 220$
$S_2 = 540$
Calculate the final share $S_3$.
$S = S_1 + S_2 + S_3$
$S_3 = S - (S_1 + S_2) = 1337 - (220 + 540) = 577$
Let's split another secret to perform an addition:
$T = 1440$
$T_1 = 118$
$T_2 = 330$
$T_3 = 992$
Distribute the shares to the participants.
Participant 1: $S_1$ and $T_1$
Participant 2: $S_2$ and $T_2$
Participant 3: $S_3$ and $T_3$
Each participant can perform the addition locally.
$R_1 = S_1 + T_1 = 220 + 118 = 338$
$R_2 = S_2 + T_2 = 540 + 330 = 870$
$R_3 = S_3 + T_3 = 577 + 992 = 1569$
Reconstruct the result from the shares:
$R = S + T$
$R = (S_1 + S_2 + S_3) + (T_1 + T_2 + T_3) = (S_1 + T_1) + (S_2 + T_2) + (S_3 + T_3)$
$R = 338 + 870 + 1569 = 2777$
Since operations on secret-shared numbers produce secret-shared numbers, they can be executed one after the other and create more complex functions. This way, any function given as a circuit can be evaluated on secret-shared numbers:
The secret inputs of the parties are secret-shared between them.
The circuit is evaluated, gate by gate, using secret-shared numbers.
The output is reconstructed from the final shares.
Reconstruction only happens at the end. In all previous steps, parties work with their own shares, so as not to reveal anything about the secret inputs.
No single party can be inherently trusted. Parties interact with each other through the protocol and this outlines the expected behaviors and communications for each participant. The protocol specifies the actions to take at each step, including what messages to send, to whom, and when to stop.
Adversaries can corrupt parties at any stage of the process. Depending on the threat model, corrupted parties might either follow the protocol or deviate from it:
Semi-honest (Honest-but-curious): These adversaries corrupt parties but follow the protocol as specified. While they execute the protocol honestly, they try to learn as much as possible from the messages they receive from other parties.
Malicious (Active): These adversaries may cause corrupted parties to deviate from the protocol.
In terms of security guarantees, we can classify protocols in:
Protocols guaranteeing security in the presence of an honest majority
Protocols guaranteeing security against an arbitrary number of corrupted parties
Protocols of the first type are generally more efficient than those of the second type, even in hybrid models that implement ideal cryptographic primitives such as oblivious transfer. However, the second type of protocols offers a significant qualitative advantage, as they provide security without requiring any trust among parties. This is especially important in secure two-party computation.
Despite the demand for this technology, its practical adoption remains limited. This limitation is mainly due to the efficiency challenges associated with the underlying protocols. Although generic protocols have been known for over 30 years, they were largely theoretical and too inefficient for practical use.
Two key factors impact performance: communication and computation.
This includes the volume of data exchanged and the number of communication rounds required.
Data Volume: Total size of messages exchanged between parties during the protocol execution.
Communication Rounds: Number of back-and-forth message exchanges required to complete the protocol.
Refers to the amount of processing power required. The key factors here are the complexity and the number of cryptographic operations.
As evidenced by the results in [1], MPC is feasible for intranet applications with limited peers, low latency, and high transmission rates. However, it faces significant execution time increases under less optimal conditions. Specifically:
Transmission Rate: Lower transmission rates lead to notable execution time delays.
Number of Peers: An increase in the number of peers results in longer execution times.
Network Latency: Even small delays in network latency can cause substantial increases in execution time.
Therefore, while real-time applications of MPC currently seem unfeasible, use cases with softer time constraints or faster infrastructure remain viable.
MPC can be integrated with zero-knowledge proofs and fully homomorphic encryption to enhance security and functionality. Consider exploring the following resources on the PSE Blog:
Secure multi-party computation is a powerful cryptographic tool that allows multiple parties to work together on a function without revealing their private data. Despite its potential, practical use has been slow due to issues like high communication costs and intense computational needs. However, as technology improves and protocols are refined, MPC applications are growing. This technology is key for enabling secure, distributed computation and data analysis in our increasingly connected digital world.
These are some MPC projects we're building at PSE:
mpz: Collection of multi-party computation libraries written in Rust :crab:.
tls-notary: Data provenance and privacy with secure multi-party computation.
circom-2-arithc: Circom to Arithmetic Circuit compiler.
circom-2-arithc-ts: Circom to Arithmetic Circuit compiler TypeScript library.
And this is a great list of software libraries and frameworks to start building:
Dickmanns Ludwig and von Maltitz Marcel. "Performance of Secure Multiparty Computation." PDF, 2019.
Escudero Daniel. "An Introduction to Secret-Sharing-Based Secure Multiparty Computation." PDF, 2022.
Evans David, Kolesnikov Vladimir, and Rosulek Mike. "A Pragmatic Introduction to Secure Multi-Party Computation." PDF, 2018.
Hastings Marcella, Hemenway Brett, Noble Daniel, and Zdancewic Steve. "SoK: General Purpose Compilers for Secure Multi-Party Computation." PDF, 2019.
Ishai Yuval, Prabhakaran Manoj, and Sahai Amit. "Founding Cryptography on Oblivious Transfer β Efficiently." PDF, 2008.
Lindell Yehuda. "Secure Multiparty Computation (MPC)." PDF, 2020.
Mann ZoltΓ‘n ΓdΓ‘m, Weinert Christian, Chabal Daphnee, and Bos Joppe W. "Towards Practical Secure Neural Network Inference: The Journey So Far and the Road Ahead." PDF, 2022.
Yakoubov Sophia. "A Gentle Introduction to Yaoβs Garbled Circuits." PDF, 2017.