Note: Published March 25, 2019 on Medium
This school year, after completing work on a rewrite of BTC Relay in Solidity, Cornell Blockchain’s Research and Development Team is contributing towards two projects: Bamboo, an Ethereum smart contract programming language, and ZoKrates. This is an introduction to privacy through zk-SNARKs and the ZoKrates Project, an Ethereum toolbox to improve security and privacy in smart contracts. Written by Eric Hu
Introduction to Privacy and Zero-Knowledge Proofs
Blockchains are built to be open and transparent. Pseudonymity comes from de-linking addresses from real-world identities, but blockchain transactions are designed so that anyone can map, track, or follow them. This transparency serves as a self-check of trust in a trustless system. For example, Bitcoin transactions are pseudonymous only insofar as the ownership of wallets is unknown; the transactions themselves are publicly viewable so that they can be audited.
The transparency that blockchain offers has limits in real-world application. In particular, privacy has become increasingly important. As recent large-scale database breaches have shown, keeping sensitive private information safe has become difficult for individuals and corporations alike, with the cryptocurrency community perhaps among the most frequent victims of cybersecurity problems.
Blockchains need stronger privacy guarantees to allow increased adoption. The current means of protecting privacy, preventing the linking of addresses to identities (for example, with coin tumblers), is inadequate. In fact, some uses of blockchain require a link to identity while still needing a degree of privacy. Businesses want to negotiate with manufacturers discreetly, and individuals with high net worth want to protect their assets.
The blockchain community has attempted to fix the privacy challenge through various means. One solution has been so-called “privacy tokens”, three of which are Monero, Dash, and Zcash.
Monero is a cryptocurrency that uses an obfuscated public ledger: transactions are broadcast, but their contents, source, and destination are hidden. Privacy is ensured through three mechanisms: ring signatures hide the sender among a group of possible signers, stealth addresses hide receiving addresses, and RingCT (Ring Confidential Transactions) hides transaction amounts.
**Dash** is another privacy token. Dash uses masternodes to enable “private sending”, which cloaks the sender and receiver wallets of a given transaction. Unlike Monero, addresses and holdings are visible and auditable unless the private send option is used.
Zcash is yet another option. Payments are published on a public blockchain, but an optional privacy feature conceals the sender, recipient, and transaction amount: Zcash tokens are held either in a shielded pool or in a transparent pool. Importantly, Zcash gives private transactors the option of “selective disclosure”, allowing proof of payment for auditing. One use can be seen in JP Morgan’s enterprise network Quorum, which has added a zero-knowledge security layer derived from Zcash’s core privacy technology: zk-SNARKs.
An Overview of zk-SNARKs
zk-SNARKs are a technology with far-reaching implications. At a high level, zk-SNARKs make it possible to verify the correctness of a computation without having to execute the computation itself. Importantly, the verifier does not need to know the data the computation ran on, only to be able to check that the computation was done correctly. This is what makes zk-SNARKs effective when privacy is an issue: information need not be disclosed for something to be verified, allowing trust in a trustless environment.
Privacy issues are common. Consider the trust we extend over the phone when we handle sensitive data, whether that is resolving credit card issues, discussing medical records, or dealing with government agencies: any situation where one needs to prove one’s identity. The common protocol is essentially to hand over one’s private information over the phone, be it part of a social security number, an address, a birth date, or other important data, to a complete stranger who may or may not be the party we intend.
It isn’t difficult to imagine a scenario where this can lead to unwanted consequences. Privacy can be implemented, at the very least for peace of mind, through zk-SNARKs, which can be used for identity authentication. In this case the zk-SNARK proof can be sent, such as through a mobile application on the prover’s phone, and then verified so that one’s identity is confirmed without any sensitive information disclosed.
What is a zk-SNARK?
zk-SNARKs can be broken down various ways, so let’s start with the acronym itself. This is a brief overview of a detailed explanation written by Christian Reitwiessner here: https://blog.ethereum.org/2016/12/05/zksnarks-in-a-nutshell/
zk-SNARK stands for “Zero-Knowledge Succinct Non-Interactive ARgument of Knowledge” and is a method of constructing a **proof**, where one can **prove possession** of information or completion of a computation/transaction without transferring that information or re-running the computation to verify it. Verification involves a “prover” showing a generated zk-SNARK proof to the “verifier”, who checks the proof for correctness and completeness without learning the information that went into producing it. Therefore, no significant trust is needed, and only one party needs to hold the information for the transaction to proceed.
**ZK: Zero-Knowledge.** Touched upon earlier, this means that the verifier learns nothing about the computation or data itself when verifying the proof, apart from the fact that the statement/transaction/computation is valid or **true**. Particularly important is that the verifier learns nothing about the **witness** (see below for “knowledge”).
**S: Succinct.** This is of great importance, particularly when dealing with complex transactions. The size of the proof is very small, even for complex computations. This has large implications for increasing blockchain network capacity and efficiency.
**N: Non-Interactive.** This is also very important for the efficiency of the protocol. Interactivity refers to rounds of messages being sent between provers and verifiers. In other systems of verification, numerous rounds of exchange between a verifier and prover are necessary to ensure correctness. In zk-SNARKs, after an initial setup phase, the prover sends a single message to the verifier (a **single point of contact**). SNARKs also have a property called “**public verifier**”: anyone can verify a proof without interacting with the prover again. Avoiding repeated rounds, which are both resource and time intensive, makes zk-SNARKs conducive to use on blockchains.
**AR: Argument.** The proof system is **computationally sound**. That is, an accepted proof reflects a true statement, because creating a valid-looking proof of a false statement is computationally infeasible, while proofs of true statements can be generated relatively efficiently.
**K: of Knowledge.** It is not possible for the prover to construct a proof **without knowing** a “witness”, that is, a set of inputs that the computation being verified is run on. Examples include the address that the values are spent from, the path to a certain node in a Merkle tree, a hash function preimage, etc. As mentioned earlier, for the zero-knowledge aspect to hold, the prover knows the witness but the verifier must learn nothing about it.
How is a zk-SNARK Implemented?
1. Succinctness
The next point, **encoding**, explains succinctness a bit more, but as noted earlier, succinctness is an important aspect of SNARKs that allows small, and thus easily verifiable, proofs for transactions. This is possible because, when verifying, a random evaluation point **“s”** is chosen that reduces the entire problem to a simple multiplication and equality check on numbers. For example, to check whether `t(x)h(x) = w(x)v(x)`, it suffices with high probability to check a single point, since two distinct degree-`d` polynomials can agree on at most `d` points.
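The single-point check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the modulus `P` is a toy prime chosen for readability, and the helper names `poly_eval`, `equal_at_point`, and `probably_equal` are hypothetical, not part of any zk-SNARK library.

```python
import random

# Toy prime field modulus (an assumption for illustration; 2**61 - 1 is a Mersenne prime).
P = 2**61 - 1


def poly_eval(coeffs, x):
    """Evaluate a polynomial with low-to-high coefficients at x, modulo P (Horner's rule)."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result


def equal_at_point(t, h, w, v, s):
    """Check t(s) * h(s) == w(s) * v(s) at the single point s."""
    lhs = poly_eval(t, s) * poly_eval(h, s) % P
    rhs = poly_eval(w, s) * poly_eval(v, s) % P
    return lhs == rhs


def probably_equal(t, h, w, v):
    """Spot-check the polynomial identity at one uniformly random point."""
    s = random.randrange(P)
    return equal_at_point(t, h, w, v, s)


x_plus_1, x_plus_2, x_plus_3 = [1, 1], [2, 1], [3, 1]
# (x + 1)(x + 2) and (x + 2)(x + 1) are the same polynomial, so every point agrees.
print(probably_equal(x_plus_1, x_plus_2, x_plus_2, x_plus_1))
# (x + 1)(x + 2) and (x + 1)(x + 3) differ, so a random point almost surely exposes it.
print(probably_equal(x_plus_1, x_plus_2, x_plus_1, x_plus_3))
```

In the second case the two products agree only at `s = P - 1`, so a uniformly random point misses the difference with probability `1/P`.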
2. Encoding, as a Polynomial Problem
In order to check a program or computation, it is compiled into a quadratic polynomial equation `t(x)h(x) = w(x)v(x)`, where equality holds if and only if the program is computed correctly. As mentioned above, this also allows succinctness. When in use, the prover wants to convince the verifier that the equality is true, that is, the program or transaction is computed correctly.
The steps to achieve encoding are as such:
Computation or Code ➡️ Arithmetic Circuit ➡️ R1CS (Rank 1 Constraint System) ➡️ QAP (Quadratic Arithmetic Program) ➡️ zk-SNARK
The first step of the process is the circuit, which breaks the computation down into small operations of addition, subtraction, multiplication, or division. The rank 1 constraint system, or **R1CS**, then confirms that the values travel through the circuit correctly (inputs travel “left to right”); for example, for an addition gate with inputs a and b, the **R1CS** confirms that the output value a+b is correct.
In the R1CS (rank 1 constraint system) representation, the verifier has to check many constraints, one for almost every wire of the circuit. This is resource and time intensive, so, as mentioned above under succinctness, it is possible to bundle all constraints into a representation called a **QAP (“Quadratic Arithmetic Program”)**, where only a **single constraint** needs to be checked. This constraint is between polynomials: one checks that two polynomials match at a single randomly chosen point, which verifies the proof correctly with high probability.
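As a rough illustration of the circuit-to-R1CS step, the sketch below flattens the expression `(a + b) * (b * c)` into three constraints of the form `<A, w> * <B, w> == <C, w>` and checks a witness against them. The witness layout, the toy modulus `P`, and the names `make_witness` and `r1cs_satisfied` are assumptions made for this example only.

```python
# Toy prime field modulus (assumption; real systems use a much larger, curve-specific prime).
P = 2**61 - 1

# Witness layout (an illustrative choice): [1, a, b, c, sum_ab, prod_bc, out]
# Each constraint is a triple (A, B, C) meaning <A, w> * <B, w> == <C, w>.
CONSTRAINTS = [
    # (a + b) * 1 == sum_ab
    ([0, 1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0]),
    # b * c == prod_bc
    ([0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0]),
    # sum_ab * prod_bc == out
    ([0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0, 1]),
]


def dot(row, witness):
    """Inner product of a constraint row with the witness vector, modulo P."""
    return sum(r * x for r, x in zip(row, witness)) % P


def make_witness(a, b, c):
    """Run the circuit gate by gate and record every intermediate wire value."""
    sum_ab = (a + b) % P
    prod_bc = (b * c) % P
    out = (sum_ab * prod_bc) % P
    return [1, a, b, c, sum_ab, prod_bc, out]


def r1cs_satisfied(witness):
    """Return True if the witness satisfies every constraint."""
    return all(dot(A, witness) * dot(B, witness) % P == dot(C, witness)
               for A, B, C in CONSTRAINTS)


witness = make_witness(1, 2, 3)
print(witness[-1], r1cs_satisfied(witness))  # 18 True
```

Changing any single wire value in the witness, for example the output, makes at least one constraint fail.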
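The bundling of constraints into a single polynomial check can be sketched as follows. This is a simplified illustration, not the exact construction or notation used by any particular library: it uses the common form `a(x) * b(x) - c(x) = h(x) * z(x)`, a toy modulus `P`, the circuit `(a + b) * (b * c)`, and hypothetical helper names throughout.

```python
import random

P = 2**31 - 1  # toy prime field modulus (assumption)

# R1CS for (a + b) * (b * c); witness layout [1, a, b, c, sum_ab, prod_bc, out].
A = [[0, 1, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0]]
B = [[1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0]]
C = [[0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0, 1]]
XS = [1, 2, 3]  # constraint i is tied to the evaluation point XS[i]


def poly_eval(f, x):
    """Evaluate low-to-high coefficients f at x, modulo P."""
    result = 0
    for coef in reversed(f):
        result = (result * x + coef) % P
    return result


def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out


def poly_sub(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [(x - y) % P for x, y in zip(f, g)]


def interpolate(ys):
    """Lagrange-interpolate the polynomial through the points (XS[i], ys[i])."""
    result = [0] * len(XS)
    for i, xi in enumerate(XS):
        basis, denom = [1], 1
        for j, xj in enumerate(XS):
            if j != i:
                basis = poly_mul(basis, [-xj % P, 1])
                denom = denom * (xi - xj) % P
        scale = ys[i] * pow(denom, P - 2, P) % P
        result = [(r + scale * b) % P for r, b in zip(result, basis)]
    return result


def poly_divmod(num, den):
    """Long division of num by den over the field; returns (quotient, remainder)."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    lead_inv = pow(den[-1], P - 2, P)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1] * lead_inv % P
        for j, d in enumerate(den):
            num[k + j] = (num[k + j] - quot[k] * d) % P
    return quot, num[:len(den) - 1]


def qap_polys(witness):
    """Return (a, b, c, z, h, remainder) with a*b - c == h*z + remainder."""
    def dot(row):
        return sum(r * x for r, x in zip(row, witness)) % P
    # Interpolation is linear, so interpolating the dot products equals
    # combining per-variable polynomials with the witness values.
    a = interpolate([dot(row) for row in A])
    b = interpolate([dot(row) for row in B])
    c = interpolate([dot(row) for row in C])
    z = [1]
    for x in XS:
        z = poly_mul(z, [-x % P, 1])  # z(x) vanishes at every constraint point
    h, remainder = poly_divmod(poly_sub(poly_mul(a, b), c), z)
    return a, b, c, z, h, remainder


def check_at(polys, s):
    """The single check: a(s) * b(s) - c(s) == h(s) * z(s)."""
    a, b, c, z, h, _ = polys
    lhs = (poly_eval(a, s) * poly_eval(b, s) - poly_eval(c, s)) % P
    return lhs == poly_eval(h, s) * poly_eval(z, s) % P


good = qap_polys([1, 1, 2, 3, 3, 6, 18])
bad = qap_polys([1, 1, 2, 3, 3, 6, 19])  # wrong output value
s = random.randrange(P)
print(check_at(good, s), check_at(bad, s))
```

For the valid witness the division leaves no remainder and the check passes at every point. For the tampered witness the remainder is a non-zero polynomial of degree at most two, so the check can pass at no more than two values of `s`.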
Arithmetic Circuit Example of (a+b) * (b*c)
Source: Zcash “What Are zk-SNARKs”
An encoding or encryption function **“E”** with some homomorphic properties is used. Homomorphic encryption allows computations on ciphertexts (the result of encrypting plaintext with an algorithm called a cipher), producing an encrypted result that, when decrypted, matches the result of performing the same operations on the plaintext. This allows the prover to compute the values E(t(s)), E(h(s)), E(w(s)), and E(v(s)) without knowing the value of s. Instead, only E(s) and other encrypted values are known. This once again demonstrates the ability to shield private information and values.
In zero-knowledge proofs, the prover has knowledge of something called a witness, which satisfies certain conditions and is kept hidden. The prover’s duty is to convince the **verifier that he or she has this witness without revealing it.** With this prover-verifier model, no data is transferred or revealed when verifying the computation or the ownership of information; the verifier learns only that it is correct. This combines with the zero-knowledge aspect because the values E(t(s)), E(h(s)), E(w(s)), and E(v(s)) above can be hidden even further by multiplying with another number. Thus, a verifier can still check that the encrypted structure is correct without knowing the encoded values, which are masked.
As a rough example, checking **t(s)h(s) = w(s)v(s)** is equivalent to checking **t(s)h(s)k = w(s)v(s)k** for a random non-zero number **“k”** that is kept secret. The important thing is that someone who is sent only the numbers **t(s)h(s)k** and **w(s)v(s)k** cannot derive **t(s)h(s)** or **w(s)v(s)**. This is explained further at https://chriseth.github.io/notes/articles/zksnarks/zksnarks.pdf
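The blinding idea can be illustrated with plain modular arithmetic. This sketch makes several assumptions: a toy prime modulus `P`, small stand-in numbers for the evaluated values, and hypothetical helper names. It shows that multiplying both sides by a secret non-zero `k` preserves the equality check, and that a blinded value is consistent with every non-zero candidate for the original.

```python
import secrets

P = 2**61 - 1  # toy prime field modulus (assumption)


def blind(value, k):
    """Multiply a field element by the blinding factor k."""
    return value * k % P


def blinded_check(lhs, rhs, k):
    """Compare the two sides after blinding both with the same secret k."""
    return blind(lhs, k) == blind(rhs, k)


def k_explaining(blinded, candidate):
    """Return the k for which candidate * k == blinded (candidate must be non-zero)."""
    return blinded * pow(candidate, P - 2, P) % P


k = secrets.randbelow(P - 1) + 1  # secret, non-zero blinding factor
print(blinded_check(42, 42, k), blinded_check(42, 48, k))  # True False

# Any non-zero candidate is consistent with the blinded value for some k,
# so seeing only the blinded number does not pin down the original.
blinded = blind(42, k)
for candidate in (1, 42, 99999):
    assert blind(candidate, k_explaining(blinded, candidate)) == blinded
```

Because `P` is prime, every non-zero `k` has an inverse, which is why equality is preserved exactly and why unequal values stay unequal after blinding.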
Ideally, due to the above implementation elements, zk-SNARKs are proofs that are verifiable incredibly quickly and in a single message, allowing efficiency without sacrificing privacy or redundancy.
The implications of such a proof are broad, with strong uses in enterprise cases that involve both large amounts of data and multiple parties. For example, if multiple companies are on a blockchain, they may want to keep their information private. zk-SNARKs will allow companies to store only the hashed (and thus hidden) zk-SNARK proof of their transactions on the blockchain, keeping sensitive data private while maintaining confidence in the blockchain’s security and connectedness and allowing greater transaction throughput.
Applying Zk-SNARKs on Ethereum: The ZoKrates Toolbox
The long-anticipated Byzantium upgrade to the Ethereum protocol, one half of Metropolis (the other being Constantinople, which is expected Feb 2019), allowed zk-SNARK proofs to be verified on the Ethereum blockchain and used in smart contracts.
Zk-SNARKs and Smart Contracts
Zk-SNARKs on Ethereum can achieve three major goals and lend themselves to interesting applications in smart contracts.
First, they allow increased **privacy**: provers can prove statements about information that does not need to be revealed, whereas in the current system everything needs to be public. Second, they allow **efficiency**: only a short, one-way interaction is needed (see “succinct” and “non-interactive” above; other approaches need multiple rounds of communication), and the complexity of verification is independent of the complexity of the computation being proved, which allows standardization of costs. Lastly, they allow **scalability**, a major issue facing most blockchains. The founder of Ethereum, Vitalik Buterin, has written that zk-SNARKs can scale the network “by a huge amount, up to 500 tx per second, without using layer 2 solutions (Plasma/Raiden)” by combining zk-SNARKs with a “relayer” node that aggregates and verifies transactions in exchange for a small fee.
Zk-SNARKs have wide potential, but at the moment it is very difficult to implement them on the Ethereum network and in smart contracts — this is where **ZoKrates **comes in.
A project spearheaded by computer scientist Jacob Eberhardt, ZoKrates takes its name and its motto, “I know that I show nothing”, from Socrates and the saying attributed to him, “I know that I know nothing”.
With scalability an issue on Ethereum, off-chain solutions have become popular. There are currently two ways of making sure that off-chain computations are correct.
One method is to publish the result of the computation and verify by working backwards from it, which is a solution some off-chain protocols use. The second, which zk-SNARKs use, is to generate, while doing the computation, a **proof** that the computation was correct. Then all that is needed is to put the result on-chain and validate the **proof** on-chain. As mentioned earlier, this is the prover-verifier relationship: the proof is generated off-chain, and the verifier validates it on-chain in a single step.
ZoKrates is used to integrate zk-SNARKs into Ethereum by generating smart contracts with built-in verification. Off-chain, a proving key and a verifying key are created, and provers use the proving key to create a proof off-chain. The proof is then submitted to the smart contract, and verification is done on-chain using the proof, the verifying key, and other parameters. If the proof is valid, the smart contract proceeds and other on-chain activity can follow.
An example of zk-SNARKs in smart contracts is showing an ID at a bar. Suppose a smart contract allows entrance if a scanned ID demonstrates that the owner is over 21. If a hash of the ID, which contains the birthdate, is on the blockchain, a prover can prove that he or she holds an ID whose birthdate makes the owner over 21 by hashing the ID off-chain and providing a zk-SNARK proof that the ID hashes to the value on-chain, all without revealing the actual ID. The verification is cheap and done right in the EVM (Ethereum Virtual Machine) in a single step.
So how does ZoKrates work? ZoKrates is a high-level, non-Turing-complete language (as opposed to Ethereum’s EVM, which is Turing complete) and a **compiler**, which compiles a set of conditions as follows:
First, form an R1CS (**rank 1 constraint system**; a list of conditions) ➡️ **QAP (quadratic arithmetic program)**, a tree with only addition/multiplication ➡️ zk-SNARK setup to generate a **prover and verifier**. This is explained above in the elements of a zk-SNARK. One can learn more by reading the ZoKrates paper, linked at the end of this article.
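To make the statement being proved concrete, here is a plain Python sketch of the relation a circuit for the bar example would have to encode. It is not a zero-knowledge proof: it simply evaluates the condition with all inputs visible. The record format, the use of SHA-256, and the function names are hypothetical choices for illustration.

```python
import hashlib
from datetime import date


def commit(id_record: bytes) -> str:
    """Hash of the ID record; this is the value assumed to be stored on-chain."""
    return hashlib.sha256(id_record).hexdigest()


def age_on(birthdate: date, today: date) -> int:
    """Whole years elapsed between birthdate and today."""
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def statement_holds(onchain_hash: str, today: date,
                    id_record: bytes, birthdate: date) -> bool:
    """Public inputs: onchain_hash, today. Private inputs: id_record, birthdate."""
    record_matches = commit(id_record) == onchain_hash
    birthdate_in_record = birthdate.isoformat().encode() in id_record
    return record_matches and birthdate_in_record and age_on(birthdate, today) >= 21


record = b"name=Alice;dob=1990-05-17"  # hypothetical ID encoding
onchain = commit(record)
print(statement_holds(onchain, date(2019, 3, 25), record, date(1990, 5, 17)))
```

In a zk-SNARK setting, the verifier would see only the public inputs and a proof that some private inputs make this function return true.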
Source: Zcash “What are zk-SNARKs”
Currently, zk-SNARKs are implemented through a C++ library called libsnark, which makes them accessible. However, libsnark is difficult to use, which is where ZoKrates makes the process easier.
ZoKrates defines a simpler language in which C++ is not needed, just a number of arithmetic statements, and the output is a contract in **Solidity** where a certain method can be called to ensure inputs are verifiable. ZoKrates is a middle layer that sets up the groundwork when writing the contract and implements the zk-SNARK for the user.
The ZoKrates library creates the actual Ethereum smart contract for the user; the only part the user specifies is the desired constraints, written in the ZoKrates language. One can then create proving and verifying keys for a circuit and, so long as the user has a valid “solution” or “witness” to the circuit, one can generate a valid transaction to the contract. This setup closely ties to the elements of zk-SNARK implementation explained earlier.
For example, the following ZoKrates program allows the prover to show that they know the factorization of a number without revealing any of the factors.
    def main(field c, private field a, private field b) -> (field):
      field d = a * b
      // assert that the private factors multiply to the public input
      c == d
      return 1
We see that this program takes two private inputs, the factors of c, and a public input c. The program asserts that `a*b == c`. ZoKrates will compile this program to an intermediate representation and run a trusted setup protocol to generate proving and verifying keys.
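The relation this program enforces can be mirrored in ordinary Python to show the split between public and private inputs. This is only a sketch: the constant below is the scalar field prime of the BN128 curve, which this example assumes is the field ZoKrates `field` values live in, and `factorization_holds` is a hypothetical name.

```python
# Assumption: field elements are integers modulo the BN128 scalar field prime.
FIELD_MODULUS = (
    21888242871839275222246405745257275088548364400416034343698204186575808495617
)


def factorization_holds(c, a, b):
    """Public input: c. Private inputs (the witness): a and b.

    Mirrors the assertion a * b == c, with arithmetic done in the field.
    """
    return (a * b) % FIELD_MODULUS == c % FIELD_MODULUS


print(factorization_holds(15, 3, 5))  # True
print(factorization_holds(15, 2, 7))  # False
```

Note that `a = 1, b = c` also satisfies this relation, so an application that needs a non-trivial factorization would have to add further constraints.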
Once again, ideally, such zk-SNARKs are proofs that are verifiable incredibly quickly and in a single message, allowing efficiency without sacrificing privacy or redundancy.
What Cornell Blockchain has been Working On:
The Cornell Blockchain R&D Team has recently contributed a series of language features to ZoKrates, including implementation of boolean operators and inequalities, which allow more complex programs with conditionals, such as:
    def main(field a, field b, field c) -> (field):
      field d = if (a < b) && (c > b || c == a) then b else c fi
      return d
The team is also currently experimenting with a rewrite of the parser using a parser combinator to allow new features to be added more easily. Other areas of work include building a library of cryptographic primitives and allowing integration with libsnark circuits.
The implications of proofs like zk-SNARKs are broad, with strong uses in enterprise cases that involve both large amounts of data and multiple parties. A public blockchain is used by parties who are strangers to one another, and these strangers may want to keep their information private. zk-SNARKs will allow parties to store only the proof of their transactions on the blockchain, keeping sensitive data private from fellow strangers while maintaining confidence in the blockchain’s security and connectedness and allowing greater transaction throughput. The ability to implement zk-SNARKs directly in smart contracts on Ethereum through ZoKrates has the potential to greatly improve scalability and privacy, bringing the blockchain community one step closer to the goal of an open yet secure method of moving information around the world.
To read more about the topic, read an explanation of zk-SNARKs by Christian Reitwießner at http://chriseth.github.io/notes/articles/zksnarks/zksnarks.pdf, and an explanation of ZoKrates by Jacob Eberhardt and Stefan Tai at https://www.ise.tu-berlin.de/fileadmin/fg308/publications/2018/2018_eberhardt_ZoKrates.pdf
See more about the open-source ZoKrates project and join the Cornell Blockchain Research and Development team in their efforts with the ZoKrates team at: https://github.com/Zokrates/ZoKrates
Special thanks to Brian Guo and Tjaden Hess, Cornell Blockchain R&D team leads for their input and work on the project, as well as members of the Cornell Blockchain ZoKrates team Cosmo Viola and Alexander Frolov for their contributions.
Learn more about the R&D Team’s Contributions at www.github.com/cornellblockchain
Learn more about Cornell Blockchain at www.cornellblockchain.org