This is the second part of a series of articles explaining how cryptography in Ethereum works. The article assumes prior knowledge about Elliptic Curve Cryptography covered in the previous article. We cover the absolute basics of BLS signatures and how their aggregation works.
Digital signatures play a big role in blockchains. A message that is digitally signed guarantees two things:
Digital signatures aren't new and they've been there since the dawn of public key cryptography. However, recent advances in Pairing-Based Cryptography have had a significant impact on the Consensus Layer's design.
A digital signature is used every time a user sends a transaction on Ethereum. This is done at the transactional level. At the consensus level, digital signatures are not employed in the current Proof of Work Ethereum. In PoW protocols, a block only needs to have the right hash to indicate that it was mined correctly; the miner's identity is irrelevant, hence no signature is required.
Validators in the PoS Ethereum system are identifiable and accountable for their actions. We require validators to make attestations and blocks to be uniquely recognizable to enforce the Casper FFG rules and count votes for the LMD GHOST fork choice.
Since a digital signature permanently links the sender of a message to its contents, it can be used to prove and penalize a validator who has been acting maliciously. This functionality can also be utilized outside of the protocol – for example, as an anti-spam measure, signatures are validated by nodes in the gossip layer before being forwarded.
To digitally sign user transactions, the Execution Layer (like most blockchains) uses the Elliptic Curve Digital Signature Algorithm (ECDSA). ECDSA signatures are short (65 bytes) and verified in a matter of milliseconds. The Consensus Layer is designed in such a way that it can operate on a far bigger scale than standard blockchains. It requires validators to add signatures to their blocks so that other users or validators may verify the block's authenticity. Validators also sign their other actions on the Beacon chain with their signatures.
There can be situations where hundreds of thousands of signatures might need to be verified within a single epoch. If we use ECDSA, it would take us nearly 30 minutes to verify a million signatures – this wouldn't be good enough for the Beacon chain.
A large number of protocol messages must be handled by proof of stake protocols. Over 780 attestations gossip globally every second on the Beacon Chain, which now has almost 400,000 active validators. The figure of 780 is simply an average; there can be much higher surges. These digital signatures must be confirmed by each node in addition to traveling over the network, which is a computationally intensive operation. These signed communications must be saved in the block history as well. In most Proof of Stake networks, these conditions severely limit the number of validators.
The initial design for post-PoS Ethereum was laid out in early 2018 by EIP-1011. The proposal focused on on-chain PoS management and estimated that the protocol could only handle a maximum of around 900 validators due to the high message overhead. A hefty stake size of 1500ETH per validator was set. This was problematic as it would increase centralization.
Justin Drake's article titled Pragmatic Signature Aggregation with BLS came out in May 2018 and changed the course of Ethereum's move to PoS.
Drake suggested adopting Boneh-Lynn-Shacham, or BLS, a novel signature scheme developed by Stanford researchers. It can aggregate many digital signatures into one while keeping each validator accountable. Aggregation not only cut down on the number of individual messages that needed to be passed across the network, but it also cut down on the expense of verifying the messages' integrity.
Because of this signature aggregation feature, we decided to drop EIP-1011 and go with the Beacon chain model we have now. It makes the Consensus layer possible and allows the Beacon Chain to scale to hundreds of thousands of validators.
Note: BLS Signatures are only a temporary solution to this problem.
Trigger warning: Math; it is optional and there is no test at the end. But give it a few minutes and you’ll understand everything better.
BLS makes use of "pairing," a special property of some elliptic curves. A pairing is a function e that outputs an element in a finite field from a pair of two points on an elliptic curve. The pairings we consider are also bilinear. This bi-linearity property has enabled the development of new cryptographic methods based on pairings.
Theoretically, all elliptic curves have pairings, but there are some curves with pairings that are not suitable for cryptographic applications in practice. So how do we go about choosing curves for pairing-based cryptography?
The embedding degree k is a parameter associated with each elliptic curve that we can calculate. We need k to be a small value (certainly less than 100) to employ curves to efficiently develop pairing-based cryptography. Because nearly all elliptic curves have a very big k, which is usually the same as the field modulus q, such curves are difficult to find (at least 160 bits). There are two approaches to finding pairing-friendly elliptic curves:
Both of these approaches have certain tradeoffs.
To construct pairing-based cryptographic protocols, we must also specify a specific pairing function e. The two most commonly used pairings are Weil and Tate. Researchers have come up with several innovative pairings to speed up computation as time has passed. It's vital to know that the various pairings aren't interchangeable.
Elliptic curve cryptography relies on the difficulty of the Discrete Logarithm Problem (DLP) and the [Computational Diffie-Hellman Problem](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_problem#:\~:text=The%20Diffie%E2%80%93Hellman%20problem%20(DHP,compute%2C%20but%20hard%20to%20reverse) (CDHP). Both these problems have been well studied and carefully chosen parameters for the elliptic curves make the cryptographic scheme secure. Pairing based cryptography relies on another security assumption called the Bilinear Diffie-Hellman problem (BDHP).
Note: Read Vitalik Buterin’s article about Elliptic Curve Pairings here to get a better understanding of the topic.
BDHP hasn't been well studied as it's a much newer problem. If one can solve the DLP or the CDHP, then one can also solve the BDHP. As a result, pairing-based cryptography does not have stronger security than elliptic curve cryptography. At present, there are no known attacks that can break the BDHP.
The Beacon chain uses Boneh-Lynn-Shacham or BLS signatures with the BLS12-381 elliptic curve. BLS utilizes the "pairing" property of certain elliptic curves. Although BLS signatures are a lot slower than ECDSA signatures, they can be aggregated together and are efficient to verify at a large scale. This signature aggregation allows the Beacon Chain to scale to hundreds of thousands of network participants.
There are four key components within the BLS digital signature scheme:
The underlying math involves using 2 subgroups of the BLS12-381 elliptic curve: G₁ is defined over a base field Fᵩ, and G₂ is defined over the field extension Fᵩ₁₂ . The order of both the subgroups is r, which is a 77-digit prime number. The generator of G₁ is g₁ , and of G₂ is g₂ .
A key pair consists of a secret key along with its corresponding public key. The secret key is randomly generated in a secure environment and the public key is uniquely determined from the private key. Key pairs are commonly stored in password-protected keystore files. EIP-2335 specifies the JSON format for the storage and interchange of BLS12-381 private keys.
The secret key sk is a 32-byte unsigned integer. It is a common practice to drop the y-coordinate of elliptic curve points to halve the amount of data stored. The same is done for the private key pk, which is a point on G₁. For BLS12-381, G₁ points are reduced from 96 bytes ( 2 * 381 bits) to 48 bytes.
The G₁ and G₂ groups are interchangeable when it comes to digital signatures. We can opt to have our public keys belong to G₁ and our signatures belong to G₂ or vice versa.
The tradeoffs are execution speed and storage space. G₁ has small points and is fast, G₂ has large points and is slow. BLS12-381 was developed to implement Zcash, and they picked G₁ to represent signatures and G₂ to represent public keys for performance reasons.
In the consensus layer, we use G₁ for public keys since the aggregation of signatures happens a lot more than the aggregation of public keys. Another reason for doing so is public keys of validators need to be stored in state, so keeping the representation small is important. Signatures are G₂ points.
Only hash tree roots of objects, or their so-called signature roots, which are 32-byte strings, are signed in the beacon chain protocol. We then need to map the signing root onto an elliptic curve point in the G₂ group. For a message with a signing root m, the point is H(m). Here, H is a function that maps bytes to G₂.
Once we have H(m), the signing process is rather straightforward. The signature σ will be the scalar multiplication of H(m) by the secret key.
σ = [sk]H(m)
The signature σ is also a member of the G₂ group. It is 96 bytes long in its compressed form.
The public key of the validator who signed a message is required. Every validator's public key is stored in the Beacon state and can be retrieved easily using the validator's index, which is always available.
We require the message, the public key, and the signature of the validator who signed the message to validate a signature. The verification method yields a boolean value indicating whether the signature is legitimate or not.
In the BLS12-381 curve, a pairing simply takes points P and Q, where P∈ G₁ and Q ∈ G₂ , and outputs a point from a group Gₜ ⊂ Fᵩ₁₂ . Hence, for a pairing e, e: G₁ × G₂ → Gₜ.
Pairings are usually denoted like e(P, Q) and have special properties. If we have two points P and S in G₁ , and Q and R in G₂ ,
By convention, G₁ and G₂ are written as additive groups and Gₜ as multiplicative. The ⋅ operator is point multiplication in Gₜ.
Using this, we can deduce that all of these identities hold
e([a]P, [b]Q ) = e(P, [b]Q ) a = e(P, Q) ab = e(P, [a]Q) b = e([b]P, [a]Q)
Utilizing pairings, we can easily verify a signature. The signature is valid if and only if e(g1 , σ ) = e(pk, H(m))
Hence, given the message m, the public key pk, the signature σ, and the fixed generator g₁ of the G₁ group, we can verify that the message was signed by the secret key sk.
We’re only making use of the identity from the properties of pairings we’d seen above.
e(pk,H(m)) = e([sk]g1, H(m)) = e(g₁ , H(m))(sk) = e(g₁, [sk]H(m)) = e(g₁,σ)
The verification returns True if and only if the signature corresponds both to the public key of the signing validator (i.e. the signature and the public key were both generated from the same secret key) and to the message (i.e. the message is identical to the original signed message). Otherwise, it returns False.
So far, what we’ve covered about BLS is functionally very similar to any other digital signature scheme. The beauty of the BLS signature lies in aggregation. An aggregate signature is the same size as a regular signature (96 bytes). Aggregation means that multiple signatures over the same message – potentially thousands of signatures – can be checked with a single verification operation. This helps scale PoS blockchains and makes the consensus protocol viable.
How does it work? Since public keys and signatures are elliptic curve points, we can use the bilinearity property of the pairing function to form linear combinations of public keys and signatures over the same message. The verification still works the same.
We’ll discuss how signatures over the same message are aggregated. We add up the signatures. The addition of points on the elliptic curve is the group operation for the G₂ group, and each signature is a point in this group, thus the result is also a point in the group.
Mathematically, an aggregated signature cannot be distinguished from a non-aggregated signature. They have the same 96-byte size.
We need an aggregate public key to verify an aggregate signature. This is easy to construct, provided we know which validators had signed the original message. Again, we just need to add up the public keys of the signers. The addition is the group operation of the G₁ elliptic curve group, and the result will also be a member of the G₁ group. Hence, an aggregated public key is mathematically indistinguishable from a non-aggregated public key and has the same size of 48 bytes.
Since an aggregate signature is indistinguishable from a regular signature, and an aggregate public key is indistinguishable from a regular public key, we can simply follow our normal verification procedure. This is because of the bilinearity property of the pairing operation.
If we have an aggregate signature σ and a corresponding aggregate public key pk, and common message m, we get the following identity
e(pk , H(m)) = e(pk₁ + pk₂ + … + pkₙ , H(m))
= e([sk₁ + sk₂ + … + skₙ] g₁ , H(m))
= e(g₁ , H(m)) (sk₁ + sk₂ + … + skₙ)
= e( g₁ , [sk₁ + sk₂ + … + skₙ] H(m))
= e(g₁ , σ₁ + σ₂ + … + σₙ )
= e(g₁ , σ)
Verifying a BLS signature is a lot more computationally intensive than verifying an ECDSA signature. It is also considerably slower because of the pairing operation. So what advantage do we get by using BLS?
In situations where we can aggregate a large number of signatures, such as Beacon Chain attestation committees, BLS is advantageous. Ideally, all of a committee’s validators sign the same attestation data, hence allowing all of their signatures to be aggregated.
In practice, there might be two or three different attestations resulting from differences of opinion about the chain state between the committee members. In such a scenario, the aggregates will still be a lot lesser than the total number of members in a committee.
Aggregating the public keys and signatures is a lot cheaper than verification. This is because elliptic curve point additions are much, much cheaper than a pairing. Hence, using aggregation results in speed benefits.
We can verify a single signature with 2 pairings. Hence, we can naively verify N signatures with 2N pairings. Or we could verify N signatures by aggregating them, using: 2 pairings, (N - 1) additions in G₁, and (N - 1) additions in G₂ .
There are also space benefits. An aggregate signature has the same 96-byte length as a regular BLS signature. So, an aggregate of N signatures only occupies 1/N the space of the unaggregated signatures.
We now know about what digital signature algorithm will be used in the Consensus Layer. But, is it a good enough solution for the long term?
Stay tuned for the third part of this series for the answer, coming soon.
You can find me on Twitter here: @gryptooo