There has always been a contentious trade-off between extracting utility from a user’s data and preserving their privacy. What if you could do both? This article gives an introduction to Fully Homomorphic Encryption (FHE), a family of encryption schemes that allows users to perform computations on encrypted data, upending the utility-privacy trade-off.
The theory was first floated in the late 70s (notably, by the R and A of RSA, Rivest and Adleman) but the first functional scheme was only proposed in 2009 by Craig Gentry. It was offensively inefficient, with multiplications of two encrypted numbers taking minutes. Since then, the pace of improvements and iterations has been ramping up and FHE has gone from prohibitively expensive to just impractical. With all it promises, whether in privacy-preserving ML, encrypted blockchains, or elsewhere, it’s becoming an important part of the conversation on privacy. So, read on!
Traditionally, if an application had to perform some computation on encrypted data, it would have to decrypt the data first, perform the desired task on the plaintext, and then re-encrypt the result. When working with trusted third parties, that works well. However, it introduces issues when you don’t trust the guys on the other side. For example, I’d be all about using my genomic data to further science but I don’t know who’s getting their grubby little hands on my GATTACAs after I dutifully spit in a sample tube.
FHE’s proposal is to remove the application’s need to decrypt anything. This works by finding ways to encrypt data so that you can apply functions $+_{enc}$ and $\times_{enc}$ that maintain the same properties as normal addition and multiplication when you do them over the encrypted data (hence the name homomorphic: maintaining the same shape). If you can do that, you can work on encrypted data and, on decryption, get a matching output.

For example, I can encrypt the values $3$ and $5$, and make someone compute $3 + 5$ on those encrypted inputs without them knowing what they’re actually adding. The result of that computation, however, can be decrypted to the plaintext output by whoever has the key.
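The $3 + 5$ example can be tried out with a toy script. Note the swap: this sketch uses the Paillier cryptosystem, which is only additively homomorphic (it has a $+_{enc}$ but no $\times_{enc}$), so it is not FHE. It is just the shortest way to watch an addition happen on ciphertexts. The primes, constants and function names below are illustrative assumptions, and the key is far too small to be secure.

```python
import math
import random

# Toy Paillier keypair (assumption: tiny primes for readability, NOT secure).
P, Q = 293, 433
N = P * Q                       # public modulus
N2 = N * N
G = N + 1                       # common simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)            # private: inverse of LAM modulo N (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N using only the public values N and G."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:  # r must be invertible modulo N
        r = random.randrange(1, N)
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Decrypt using the private values LAM and MU."""
    x = pow(c, LAM, N2)
    return ((x - 1) // N * MU) % N


def add_enc(c1: int, c2: int) -> int:
    """The +_enc operation: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N2


# I encrypt my inputs...
c3, c5 = encrypt(3), encrypt(5)
# ...someone else adds them without ever seeing 3 or 5...
c_sum = add_enc(c3, c5)
# ...and only the key holder can read the result.
print(decrypt(c_sum))  # prints 8
```

Encryption is randomised, so `encrypt(3)` gives a different ciphertext on every run, yet the decrypted sum is always the same.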
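“Why addition and multiplication?”, you might ask. It turns out these two operations are computationally complete: on single bits, addition and multiplication modulo 2 are the XOR and AND gates, and (together with the constant 1) any computation can be expressed as a circuit of those gates. More generically, for any computation $f$ (and its equivalent on encrypted data, $f_{enc}$) on a message $m$, you get something like this, writing $Enc$ and $Dec$ for encryption and decryption under my key:

$$Dec(f_{enc}(Enc(m))) = f(m)$$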
I could encrypt my message $m$ and send it to some snakey third party that won’t be able to decipher it but will still be able to produce something useful by applying $f_{enc}$ to it. In this example, if I have the key, I can then decipher what I get back.
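To make the claim that addition and multiplication are enough a bit more concrete, here is a small sketch that builds an integer adder out of nothing but addition and multiplication modulo 2. It runs on plain bits with no encryption at all. The assumption is that in a real FHE scheme `add` and `mul` would be swapped for their encrypted counterparts and the circuit on top would stay the same. All function names are made up for illustration.

```python
def add(a: int, b: int) -> int:
    """Addition modulo 2, which is XOR on bits."""
    return (a + b) % 2


def mul(a: int, b: int) -> int:
    """Multiplication modulo 2, which is AND on bits."""
    return (a * b) % 2


def not_(a: int) -> int:
    """NOT a = a + 1 (mod 2)."""
    return add(a, 1)


def or_(a: int, b: int) -> int:
    """a OR b = a + b + a*b (mod 2)."""
    return add(add(a, b), mul(a, b))


def full_adder(a: int, b: int, carry_in: int):
    """One column of binary addition, using only add and mul."""
    total = add(add(a, b), carry_in)
    carry_out = add(mul(a, b), mul(carry_in, add(a, b)))
    return total, carry_out


def add_numbers(x: int, y: int, width: int = 8) -> int:
    """Add two integers bit by bit with a ripple-carry circuit.

    The shifts and masks only split the inputs into bits and reassemble the
    output; every step of the actual computation goes through add and mul.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


print(add_numbers(3, 5))  # prints 8
```

Comparisons, branches and lookups can be built from these gates in the same way. This is also part of why naive FHE programs get expensive: every gate becomes a ciphertext operation.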
Seeing this, we get an idea of how FHE can have an enormous impact on how we understand data privacy. For example, users would be able to offload expensive computations to cloud providers in such a way that the providers would not have access to the plaintext data at all.
In a nutshell, this lets multiple parties work collaboratively on data without having to trust each other. Unapologetically, we’ll start with Web3 use cases ¯\_(ツ)_/¯
Zero-knowledge (ZK) proofs, recently very popular, have been sold in every shape and form as crypto’s solution to privacy. They will undoubtedly play a huge part in increasing the privacy of distributed systems but have limitations when it comes to shared private state. For example, try to come up with a protocol for private on-chain voting.
This is where FHE shines, as no sensitive data has to be decrypted before being aggregated. This unlocks:
Fully composable and programmable private blockchains
Sealed-bid auctions
Private on-chain voting
Dark pools (private exchanges)
More flavours of on-chain gaming with incomplete information
Trustless bridges
All the use cases below are also portable to blockchains
Private information retrieval: googling something or querying a database without giving the server any information on what you’re actually looking for
Machine Learning on private data: you could contribute your sensitive data to training models without revealing anything about yourself. For example, getting privacy-preserving tailored advertising or contributing health data to create more accurate medical models.
Private AML, KYC: Let someone check you’re not doing anything naughty without having to reveal who you are or what you do!
Privacy-preserving location-based services: Leveraging your location without the current accompanying privacy and security concerns
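As a rough illustration of the private information retrieval idea in the list above, here is a toy sketch. It again swaps in Paillier (additively homomorphic only, tiny insecure key) rather than a real FHE scheme, and every name in it is a made-up assumption. The client sends an encrypted one-hot selector, and the server combines it with its records without learning which index was asked for.

```python
import math
import random

# Toy Paillier parameters (assumption: tiny primes, NOT secure, not FHE).
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private
MU = pow(LAM, -1, N)                                # private


def encrypt(m: int) -> int:
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def client_query(index: int, size: int) -> list:
    """Encrypt a one-hot selector: 1 at the wanted index, 0 elsewhere."""
    return [encrypt(1 if i == index else 0) for i in range(size)]


def server_answer(database: list, query: list) -> int:
    """Return Enc(sum(selector[i] * database[i])) without decrypting anything.

    Raising a ciphertext to the power k multiplies its plaintext by k, and
    multiplying two ciphertexts adds their plaintexts.
    """
    answer = encrypt(0)
    for value, selector in zip(database, query):
        answer = (answer * pow(selector, value, N2)) % N2
    return answer


database = [17, 42, 99, 7]               # hypothetical server-side records
query = client_query(2, len(database))   # the client wants record number 2
print(decrypt(server_answer(database, query)))  # prints 99
```

The server has to touch every record to answer, otherwise it would learn something about the index. That is one reason PIR is expensive on large databases.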
Regardless, there are still major barriers to widespread adoption of FHE, which I will be covering in more depth soon™. For now, the headlines are:
Its performance (in terms of both size and speed) still has major room for improvement
Expertise is required to write secure and efficient solutions as there are subtleties to converting arbitrary programs into FHE circuits. Code written by non-experts can lag behind expert-written solutions by many orders of magnitude
User education on the above is lacking, as is documentation
The existing schemes lack the standardisation that would facilitate research and increase security
FHE is not a silver bullet to privacy. It’s a significant tool in the suite of Privacy Enhancing Technologies (PETs) that will help facilitate privacy across all the webs but a lot of work still needs to be done in other privacy areas like ZK, MPC, mixnets, etc.
Working assumption that third parties are outputting reliable data. In most of the article, I’ve been assuming an honest-but-curious third party but what if the third party decided to output adversarial data? A lot of work still needs to go into verifiable FHE. A mix of ZK & FHE might be the MVP on this one but there’s a lot of work to be done due to the fundamental mathematical differences in how these systems are built today
A majority of the work being done is still academic. Nevertheless, due to the importance of this emerging paradigm, many major companies have FHE on their radar:
In 2021, Google open-sourced a transpiler, a library that converts arbitrary C++ into code that can work on encrypted inputs
Microsoft open-sourced a lower-level library called SEAL in 2018. It requires more FHE knowledge but is more modular as it supports different FHE schemes and lets you tinker with the scheme parameters (more on this next time). They’ve also leveraged FHE through the Edge browser to work on password monitoring (look at you go, Edge! Good job, buddy)
Intel is working on hardware acceleration as FHE brings about significant performance challenges
IBM is providing consulting services and tailored FHE cloud infrastructure
Ericsson has made noise about leveraging FHE for privacy-preserving AI
You will also find a handful of new companies specialising in the topic:
Zama is building tools like Concrete and TFHE-rs (compilers to add FHE to Python and Rust), Concrete ML (a framework for privacy-preserving ML) and doing great work on contributing research and fostering a community
Sunscreen is also building a Rust compiler, but actively focusing on the web3 space
Secret Network, a privacy-preserving programmable blockchain, has historically relied on secure hardware but has been vocal about the development of FHE as a solution to hardware dependency
Blyss is developing a key-value private retrieval tool, built on one of the co-founders’ research
Duality Tech and Enveil propose encrypted querying of databases and a suite of privacy-preserving statistical and ML tools
Inpher works on enterprise privacy-preserving ML
Cornami offers FHE-enabled data pipeline software
Desilo, Lorica and Decentriq work on securely linking and drawing insights from different siloed datasets
Private Identity is working on identity using encrypted biometrics
That’s all we have for you today! We are aiming for a series of articles that will demystify everything FHE, and if there’s anything you’d like to see, or you have feedback on the article, please reach out on Twitter @romdespoel.
*Hardware is a tough section to categorise. Due to the field’s infancy and lack of standardisation, it’s very early to start focusing on hardware. Nevertheless, some advances in ZK hardware optimise operations like FFTs and MSMs which are also relevant for FHE. Companies working on this have been omitted.