You may have noticed how many people wonder why such a complicated solution was adopted for Ethereum scaling. Maybe you’re even one of them. Like, why “rollups” if there are many other options, from Solana-style vertical scaling to block expiry and execution sharding?
In the past I was a huge fan of execution sharding and strongly opposed rollup scaling as too complex and pointless if execution sharding exists. In response I always got a non-exhaustive “it’s inefficient”. Time flies, and after reading dozens of topics, articles, rationales, and sharding and danksharding designs, I can say that I changed my mind. I want to summarise rollup scaling by comparing it with other scaling designs, and show exact numbers and facts about their effectiveness to people who have the same doubts I had a year ago.
Special thanks to Domothy for feedback and review
Theory of scaling
Ethereum chain is slow.
It’s slow because all nodes have to execute all transactions happening on-chain themselves. All computers running Ethereum EL+CL clients have to download and re-execute all 18.5 million (at the time of writing) blocks and all the transactions inside them, and once they’re done, they do the same with new blocks, which are created every 12 seconds.
With this system it’s impossible to make a weaker node execute fewer computations than a stronger one. If on-chain computations are only within the power of a stronger node, a weaker node simply won’t keep up with the chain, eventually leaving the network with only powerful nodes. It follows that the entire network operates at the speed of a single node. The load can’t be distributed between the nodes.
One might conclude that if the solid network can’t distribute the load between nodes, we should make node requirements as high as possible. That way we can make the network much faster, since modern enterprise-grade servers can process hundreds of thousands of transactions.
This is called vertical scaling - we make a single instance of the system more powerful to increase the capacity of the whole system. Among the platforms that adopted this type of scaling is Solana.
However, this type of scaling is not perfect. By increasing node requirements, you reduce the number of people who can afford to run their own node. If ordinary users of the network can’t run a node themselves, they have to trust those who do - block explorers, commercial projects, dApps, etc. Trust means huge risks for ordinary users, because they become easy targets for malicious actors in the network infrastructure. Cryptocurrency as an idea strives to remove any need for trust by letting you verify your funds, and any other parts of the network you may need, yourself, which isn’t really possible with vertical scaling. That’s why the Ethereum community has rejected this type of scaling since launch and looks for other designs instead.
Ok. Let’s assume that we have hundreds or thousands of lightweight nodes that operate the solid chain at low speed. What stops us from making each node execute its own set of transactions, effectively distributing the load between all existing nodes?
One of the older scaling mechanisms that followed this logic was sharding, or execution sharding. The mechanism works like this:
We make many parallel chains that are interoperable and coordinated by a beacon chain that provides staking and controls validators. Then, we distribute all existing validators between these chains, so each validator secures the chain it’s assigned to. Looks quite simple and natural - why should we all run the same chain if we can split into many groups and run our own interoperable chains?
Among the chains that adopted this mechanism is NEAR. At the time of writing, the network isn’t sharded due to low demand, but the sharding system is integrated into the protocol and may be used in the future.
What is a rollup
I’ll just assume optimistic rollups don’t exist because they’re pretty much obsolete and it’ll be much simpler to explain without them
Another scaling design, the one currently chosen by the Ethereum community, is rollups. Rollups are separate blockchains that prove their state with transactions on the solid Layer 1 network, in our case the Ethereum chain. Each rollup has a two-way bridge built into the system, so it can interact with L1 and therefore with other rollups.
Re-executing all rollup transactions on L1 would be too expensive and therefore pointless, so rollups use other ways to prove their transactions’ validity without actually running them on L1. One way that has recently gained popularity and is recommended for new rollups is zero-knowledge proofs.
What’s a zero-knowledge proof? I’ll take an explanation from my previous article, because it’s useful here as well:
Well, my explanation will not be exhaustive at all, because it all sounds like magic, and to understand how it works internally you need strong knowledge of cryptography. A ZK (zero-knowledge) proof is a set of mathematical equations that lets you check the outputs of some computations without executing these computations yourself, either because they’re too hard for you to execute or because you don’t have some of the values needed to execute them.
In our case, we know all necessary values (rollup posts them on the L1 blockchain), but they’re too hard for L1 to execute so we want to avoid re-execution, that’s why we use ZK proofs.
The important characteristic of ZK proofs is that they’re hard to generate (much harder than the computations they prove) but extremely cheap to validate (compared to the computations they prove). That’s because in order to generate a ZK proof for some computations, you have to “convert” all these computations into lots of mathematical equations. This lets us move the computation burden from all nodes to a few sequencers with powerful machines, while nodes enjoy a low load.
Drawbacks and disadvantages (what’s worse?)
What’s the limitation of execution sharding? Let me explain using Ethereum as an example.
Currently the Ethereum network is secured by 28.2 million ETH. To attack it, you would have to burn 1/3 of that stake to prevent the chain from finalising, or 2/3 of it to roll back blocks.
At a price of 2050 USD per ETH, that’s 58 billion dollars of economic security for a single chain.
All these validators secure the single chain. Let’s imagine that we create 100 shards and distribute validators between them. If previously we had 880k validators for the single chain, now we have 8.8k validators per shard, or 580 million dollars of economic security, which means roughly 193 million dollars (1/3 of that) is enough to attack one shard.
If one shard is secured by significantly less money than it contains, attacking it becomes profitable.
Therefore, a sharded network can’t contain too many shards, because their economic security would be too weak. It turns out that it’s impossible to truly distribute the load across every network entity (validator node), because validators have to be divided into large enough groups for the network to remain secure.
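To make the dilution effect concrete, here is a minimal sketch of the arithmetic above. It uses only the stake and price figures from the text and assumes validators are split evenly between shards; the function names are made up for this illustration.

```python
# Figures quoted in the article at the time of writing.
TOTAL_STAKE_ETH = 28_200_000
ETH_PRICE_USD = 2_050


def shard_security_usd(shard_count: int) -> float:
    """USD value of the stake securing one shard (even split assumed)."""
    return TOTAL_STAKE_ETH * ETH_PRICE_USD / shard_count


def cost_to_stall_usd(shard_count: int) -> float:
    """Stake needed to hold 1/3 of one shard and block its finality."""
    return shard_security_usd(shard_count) / 3


for shards in (1, 64, 100, 1000):
    per_shard = shard_security_usd(shards) / 1e6
    stall = cost_to_stall_usd(shards) / 1e6
    print(f"{shards:>5} shards: {per_shard:>9,.0f}M USD per shard, "
          f"{stall:>9,.0f}M USD to stall one shard")
```

The per-shard figure falls linearly with the number of shards, which is exactly why the shard count can't grow freely.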
We will use 64 shards, because that number is used in most sharding specs, and fixing a specific number will greatly help with the calculations later on.
Ok. What about rollups? Let’s look at a lifecycle of rollup batches.
Currently, all data of L2 transactions is stored on L1.
Some rollup proposer (or proposers if we talk about decentralised rollups) sends the compressed batch of these transactions to L1 through calldata,
then some rollup sequencer (or sequencers) reads this batch from the proposer's transaction on the blockchain, re-executes it, generates a ZK proof of this batch’s validity and sends this proof to the smart contract on L1.
The smart contract verifies the proof, and if it’s valid, then the batch it proves is valid by definition. The smart contract updates the canonical Merkle root (very roughly speaking, a hash of everything) of the state, and rollup nodes can then keep up with the chain according to this root in the contract on L1.
Do you see a bottleneck here? Don’t worry, I’ll help you.
Storage in Ethereum blockchain is limited. I mean, very limited. How much?
One non-zero byte (you compress all zeroes down, right?) of calldata storage costs 16 gas. An Ethereum block has a soft limit of 15m gas. If we assume that 1/4 of all gas in the block is spent only on rollup data, that’s 3.75m gas, so we can fit 3.75m/16 = ~229 KB of data per block.
I’ll take zkSync’s average transaction size of 177 bytes, so we get ~1324 transactions per block or ~110 TPS. This is obviously not much. And these transactions would be really expensive, because demand for Ethereum blockspace is always high, and you’d have to pay a really high fee to take 1/4 of all blockspace just for your rollup.
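The calldata estimate above can be reproduced in a few lines. This is only a sketch of the same back-of-the-envelope calculation: the 1/4 share of the block is the article's assumption, and every byte is treated as non-zero.

```python
GAS_SOFT_LIMIT = 15_000_000      # gas per block (soft limit)
ROLLUP_SHARE_DIVISOR = 4         # assume rollups take 1/4 of the block
GAS_PER_NONZERO_BYTE = 16        # calldata cost per non-zero byte
AVG_TX_BYTES = 177               # zkSync average used in the article
SLOT_SECONDS = 12

rollup_gas = GAS_SOFT_LIMIT // ROLLUP_SHARE_DIVISOR
bytes_per_block = rollup_gas // GAS_PER_NONZERO_BYTE
tx_per_block = bytes_per_block // AVG_TX_BYTES
tps = tx_per_block / SLOT_SECONDS

print(f"gas for rollup data: {rollup_gas:,}")
print(f"data per block:      {bytes_per_block:,} bytes "
      f"(~{bytes_per_block / 1024:.0f} KB)")
print(f"transactions/block:  {tx_per_block:,}")
print(f"throughput:          ~{tps:.0f} TPS")
```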
If Ethereum blockspace is limited, we probably have to somehow spend it more efficiently to fit more transactions? This is already done by many rollups through compression.
Wait, why do we even have to store rollups’ batches permanently, if in the end rollup nodes just listen to the latest root from the L1 smart contract?
If you already have this question, you’re right!
All rollup nodes by design use the rollup smart contract on L1 as a source of trust. If the contract isn’t buggy and L1 isn’t attacked (and it probably isn’t), then everything the contract says is the truth. Therefore, thanks to the blockchain, everything the contract ever said is the truth too.
Scroll back to the scheme above. All transaction data is only used once - when the sequencer takes it to generate the proof for the resulting state. Then the proof for the batch is verified and the system doesn’t need these transactions anymore, because the smart contract has already accepted their execution into the rollup state.
Wait, doesn’t the smart contract need the transaction data to verify the ZK proof for it? No! All the contract needs is a commitment to the data.
Explanation for people who don’t know anything about ZK cryptography: a commitment is a really short piece of data that is derived from the main data and is used to verify ZK proofs about this data. However, a commitment can’t be used to generate proofs, only to verify them. That is, we don’t need the smart contract to have the data; we can just give it the short commitment and we’re good.
Yeah, I know, it’s freaking witchcraft. Just trust that it works; if you want, you can read about it later from people more techy than me.
The only thing we have to do is make sure that sequencers have all the data they need until they have generated the proofs and sent them to L1.
Sounds interesting! What if we make something on Ethereum that stores the data only as long as it’s needed?
We do! Ethereum developers are currently working (and they’re almost done!) on the EIP-4844 update, which enables blob transactions. A blob is a big chunk of data that is stored separately from the blockchain, so it can be deleted later. The blockchain only stores the hash of the commitment to the blob, so sequencers can reveal the commitment to the contract and use it to prove the data inside the blob.
Blobs are only stored for 4096 epochs, or 131 072 blocks, or ~18.2 days. During this time sequencers will obviously have time to generate the proof, as many proving systems are already capable of generating proofs in a few minutes. And there is even enough time for optimistic rollups, which have to wait for 7 days, but we’ll leave that for now :)
Sounds cool! Let’s calculate how many transactions this allows us to make.
One blob is 128 KB in size, and EIP-4844 (the blobs spec) specifies a conservative soft limit of 3 blobs per block. We’ll keep the 177-byte average transaction size from above, so we get: 128 * 1024 * 3 / 177 = ~2221 transactions per block or ~185 TPS.
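The blob figures above (retention window and capacity) come from the same kind of arithmetic. A short sketch using the numbers quoted in the text, plus the 32 slots per epoch of Ethereum's consensus layer:

```python
BLOB_BYTES = 128 * 1024          # one blob
BLOBS_PER_BLOCK = 3              # soft limit quoted in the article
AVG_TX_BYTES = 177               # zkSync average used earlier
SLOT_SECONDS = 12
SLOTS_PER_EPOCH = 32
RETENTION_EPOCHS = 4096

# How long a blob is kept around.
retention_slots = RETENTION_EPOCHS * SLOTS_PER_EPOCH
retention_days = retention_slots * SLOT_SECONDS / 86_400

# How many rollup transactions fit into the blobs of one block.
blob_bytes_per_block = BLOB_BYTES * BLOBS_PER_BLOCK
tx_per_block = blob_bytes_per_block // AVG_TX_BYTES
tps = tx_per_block / SLOT_SECONDS

print(f"retention:          {retention_slots:,} slots (~{retention_days:.1f} days)")
print(f"blob data per block: {blob_bytes_per_block:,} bytes")
print(f"transactions/block:  {tx_per_block:,}")
print(f"throughput:          ~{tps:.0f} TPS")
```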
It’s worth noting that unlike blockspace, blobspace will only be used by rollups, so these transactions will be cheaper than blockspace ones. Also, the soft limit of 3 blobs was chosen for testing purposes and will be increased in the future. I expect this limit to be increased 3-4 times in the short term, but I’m not a dev, so we’ll use these initial values for the calculations below.
Well, 185 TPS is better than 110, but still not really much. Is there anything we can do to increase this space? Hmm… What if I told you that we can split this data across relatively small validator groups, so the system can store more data? :)
Yes, you got me right. I’m talking about sharding, but with data. This mechanism is called danksharding.
I won’t go into the technical depths of this mechanism, especially since it’s not finished yet and is generally quite complicated, so we likely won’t see it in the next 2-3 years. But it’s something the Ethereum community is actively working on, and blob transactions are obviously one of the required steps towards danksharding.
Basically the logic is very similar to execution sharding:
We split all validators into some sort of groups and distribute all incoming blobs among these groups equally.
They must store their blobs for a certain amount of time. If any blob becomes unavailable (all validators in the group lost it or pretend to have lost it), these validators get slashed (deprived of part of their stake and kicked out).
Validators outside of this group can make sure that the data isn’t lost by asking the committee for random small parts of the blob and applying some mathematical computations to them. If these points are valid, the whole blob is considered available, since it’s practically impossible for a malicious actor to generate a fake blob that has the same points of data as the original blob. This mechanism is called Data Availability Sampling (DAS).
Wait, what? How is the third part possible? Well, it’s ZK magic as well. You can read the paragraph below, but make sure that your brain doesn’t boil:
Remember how I told you that we can use the commitment to the blob to ZK prove its contents in the EVM? To make this possible, blobs were designed not just as a chunk of bytes, but as an array of mathematical points. All these points are provable through the commitment to the blob, so the more points you ask for, the lower the chance that a malicious actor can produce malicious points that the commitment considers valid.
This is a very rough explanation that doesn’t go into depth, but the logic is like that. In fact, we don’t even need to know all the tech. You can read Blobspace 101 by Domothy to get a deeper technical understanding of how it works.
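The "more points, less chance" intuition can be put into numbers with a small sketch. It rests on an assumption that is not spelled out in the text: that blob data is erasure-coded so any half of the extended points is enough to rebuild it, which means an attacker hiding a blob has to withhold more than half of the points. Samples are modelled as independent and uniformly random, and the function names are hypothetical.

```python
import math


def fooled_probability(samples: int, available_fraction: float = 0.5) -> float:
    """Upper bound on the chance that every random sample hits an available
    point even though the blob as a whole cannot be reconstructed."""
    return available_fraction ** samples


def samples_needed(target: float, available_fraction: float = 0.5) -> int:
    """Smallest number of samples that pushes that chance below `target`."""
    return math.ceil(math.log(target) / math.log(available_fraction))


for k in (1, 10, 20, 30):
    print(f"{k:>2} samples -> chance of being fooled <= {fooled_probability(k):.2e}")

print("samples for a one-in-a-billion chance:", samples_needed(1e-9))
```

Under these assumptions the chance of being fooled halves with every extra sample, so a few dozen small queries per validator are enough to be practically certain the data is there.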
So, basically danksharding is just a fancy mechanism for distributing blobs to relatively small committees of validators, so that each validator doesn’t need to store all the data. This way, the network can handle many more blobs. I’ll take the number of 192 blobs per block, because that gives us an equal number of ephemeral “groups” (64) for both execution sharding and data sharding.
Reveal your numbers!
Let’s summarise rollup scaling.
There is a blockchain whose transactions must be confirmed by Ethereum blockchain.
Since Ethereum network nodes are run on Raspberry Pi’s and such weak garbage for maximal possible decentralisation, we can’t just re-execute all transactions of L2 blockchain on L1, it would simply be too hard.
Instead, we take one, two, five, doesn’t matter how many, expensive powerful clusters that execute extremely hard computations to generate small cryptographic proofs for the transactions of our L2 blockchain.
Then, we send this proof to a special smart contract on Ethereum that verifies it by certain cryptographic rules. It’s much easier to validate the proof than to re-execute these transactions, so we in fact move all the computation burden to a few powerful clusters, leaving the Ethereum blockchain unloaded.
This way, we get a blockchain that has all the security guarantees of Ethereum (because its validity is proven on it), but keeps the heavy computation load off Ethereum.
However, we have to store our transaction data for some time so that sequencers (these powerful clusters) can download the data and generate the proof for it. We don’t need the smart contract to have this data, since for a ZK proof a short commitment to the data is enough.
Storing data that is only needed for an hour or two on a ledger that will outlive my grandchildren is huge (and too expensive) overkill. So, we make special cells of data (blobs) that are deleted after 18 days.
But these blobs are still not enough (they give us ~185 TPS by my calculations), so we’re also going to make an update that will split all validators into a number of groups and distribute all incoming blobs among them. This way we can store significantly more blobs than if each validator kept all blobs.
Then, let’s summarise execution sharding scaling.
Validators are weak because they work on cheap node hardware such as Raspberry Pi’s. Because of it, the Ethereum network can only handle, say, 15 TPS.
We split all validators into 64 groups and make them build their own chains. All these chains are coordinated by a single beacon chain, which is run by all nodes and therefore makes all chains interoperable.
What do these designs have in common? It may seem that they’re completely different, but in the end we still split all validators into groups, each of which performs its own task.
The difference is, in execution sharding these groups handle execution - smart contracts, accounts, transactions, just as in the normal Ethereum chain. In danksharding, they handle temporary data for rollups.
And now, finally, side to side comparison.
Rollups: With EIP-4844, where the whole network handles 3 blobs per block, there will be enough data for ~185 TPS, as we calculated earlier. But if we take 64 “groups” and the same load per validator (3 blobs per block), danksharding would handle 192 blobs per block and would therefore allow rollups to perform 185*64 = ~11 840 transactions per second.
It’s worth noting that for small microtransactions such as gaming or tips it may be useful to use Validiums and off-chain data availability solutions, such as Celestia, EigenDA or Near DA. These solutions will allow many more transactions than this, while these ~12k TPS can be used for more serious payments, where full Ethereum security is necessary.
Also, blob limits will slowly be increased as the tech improves. The EIP-4844 limits were chosen as temporary ones to test the system and fix potential bugs on release. This means that with a 2x increase we’ll have ~24k TPS, with a 5x increase ~60k TPS, and so on.
I believe that over time rollup teams will improve their compression mechanisms as well; it’s generally a big field for potential improvements.
Sharding: It’s hard to approximate a TPS limit for the single Ethereum chain, since transactions require different amounts of computation - some of them are token transfers, some are trades on Uniswap, NFT trades, etc. But I think the current TPS is quite an accurate metric, since demand always covers supply (all gas is spent) and it reflects the average gas consumption of a consumer transaction.
So, we have 12 TPS on the single chain, and we would have 64 parallel chains in execution sharding. Let’s add 3 more TPS, since we assume that rollups wouldn’t exist and therefore wouldn’t consume any gas. 15 * 64 = 960 transactions per second.
So, we get at least a ~12.3x difference in performance in favor of rollups.
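The side-by-side numbers above can be recomputed with a short sketch. It takes the article's rounded inputs as given (185 TPS per 3 blobs, 15 TPS per execution chain, 64 groups); the blob-limit multipliers are the hypothetical 2x and 5x increases mentioned in the text.

```python
GROUPS = 64                 # shards / ephemeral data groups
ROLLUP_TPS_PER_GROUP = 185  # ~185 TPS for 3 blobs per block
CHAIN_TPS = 12 + 3          # current TPS plus gas freed up without rollups

danksharding_tps = ROLLUP_TPS_PER_GROUP * GROUPS
sharding_tps = CHAIN_TPS * GROUPS
ratio = danksharding_tps / sharding_tps

print(f"rollups + danksharding: {danksharding_tps:,} TPS")
print(f"execution sharding:     {sharding_tps:,} TPS")
print(f"difference:             ~{ratio:.1f}x")

# What a higher blob limit would mean for rollups.
for multiplier in (2, 5):
    print(f"{multiplier}x blob limit: {danksharding_tps * multiplier:,} TPS")
```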
By efficiency I mean how much work can be distributed among entities in the network.
In monolithic chain, no work is distributed. Each node has to execute all computations in the network. This way, the network can only work at the speed of the single instance of it.
In execution sharding, work can be distributed by splitting all validators into a number of committees, each working on its own chain. However, this distribution lowers the economic security of each shard, because each shard has fewer validators than the monolithic network, and this security drops in proportion to the number of shards.
People usually take 64 shards, because then each shard is secured by a somewhat reasonable 1/64 of the total stake = currently ~440k ETH = ~900m$. It means that distribution capacity doesn’t depend on the number of entities in the network, but on economic security. This way, the network can’t work faster than 64 instances of the network.
In rollup scaling with danksharding, most of the computation burden is on a few powerful entities that generate ZK proofs for rollup batches. All rollups work at the speed of proof generation. If proofs for only 20 transactions are generated per second, the network will handle 20 TPS, and so on.
However, proof generation speed increases with each new instance of the sequencer system. The more provers there are, the more proofs are generated per second. The more proofs are generated, the higher the capacity of rollups. The only restriction is data availability, whose limits are practically unreachable in the mid term.
It means that in rollup scaling, the network scales with each new sequencer instance.
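The three efficiency regimes described above can be written down as a toy model. This is a deliberately simplified sketch, not a performance prediction: it assumes throughput grows linearly with the number of provers, takes the 64-shard cap and the ~11 840 TPS data availability ceiling from the article's own numbers, and uses hypothetical function names.

```python
def monolithic_tps(node_tps: float) -> float:
    """Every node executes everything: the network is as fast as one node."""
    return node_tps


def sharded_tps(node_tps: float, shards: int, max_secure_shards: int = 64) -> float:
    """Work is split between shards, but economic security caps the count."""
    return node_tps * min(shards, max_secure_shards)


def rollup_tps(provers: int, tps_per_prover: float, da_limit_tps: float) -> float:
    """Each prover adds capacity until data availability becomes the ceiling."""
    return min(provers * tps_per_prover, da_limit_tps)


DA_LIMIT = 11_840  # danksharding estimate from the comparison above

print("monolithic:        ", monolithic_tps(15))
print("sharded, 64 shards:", sharded_tps(15, 64))
print("sharded, 1000 asked:", sharded_tps(15, 1000))  # still capped at 64
for provers in (1, 10, 100, 1000):
    print(f"rollups, {provers:>4} provers:", rollup_tps(provers, 20, DA_LIMIT))
```

In this model adding validators to a sharded network changes nothing once the shard cap is reached, while adding provers keeps raising rollup capacity until the data limit kicks in.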
One of the important technical differences between rollups and execution sharding is that while execution shards fully replicate the logic of Ethereum, rollups can be programmed to work on basically any existing stack, virtual machine, language, etc.
There are different types of zkEVMs: ZK rollups with ZK-friendly virtual machines that fully or partially replicate the logic of the EVM.
Some of them, such as zkSync Era, provide only language-level compatibility, which means that you can use Solidity or Vyper for your zkSync smart contracts.
Other rollups, such as Polygon zkEVM and Scroll, implement EVM compatibility, which means that you can re-use all existing infrastructure, including compilers, for their rollups, but they will have higher fees due to higher prover complexity.
Taiko, a Type 1 zkEVM, replicates all the logic of the Ethereum blockchain, including the gas system and state trie. Because of that, it’s expected that its fees will be the highest among all zkEVMs, but that’s an option for those who need full equivalence with Ethereum tech.
There are also non-EVM rollups. Eclipse works on the Solana VM, Fluent develops zkWASM, and M2 is going to use the Move VM, which is used by the L1 blockchains Aptos and Sui. Starknet uses its own unique Cairo VM.
Besides virtual machines, there are also solutions with different DA systems. If you need full Ethereum security, you might want full rollups such as zkSync or Scroll. If you’re ok with data availability solutions based on external consensus, you can use Eclipse based on Celestia DA or Fluent based on Near DA. If you’re ready to trust a single entity in exchange for near-zero fees, Validiums are your choice.
Different rollups have different decentralisation levels. zkSync is currently working on switching to decentralised sequencing, which will be built into the protocol. Taiko will be released with decentralised sequencing from day 0. Solutions such as Scroll and Starknet use centralised sequencers, which may also be an option if you don’t want to take the risks of a complex sequencing protocol or want to have pre-confirmations from a single entity.
In turn, all shards in execution sharding follow the same logic and work the same way. You won’t be able to use different programming languages and VMs, have pre-confirmations, different gas tokens, in-built account abstraction, etc. Everything is Ethereum, nothing more, nothing less.
However, this diversity comes at a cost, which I will cover later.
ELI5 - How hard is it to attack the system, what are the losses from an attack and how easy is it to recover from an attack?
In the case of an execution shard, it’s just as vulnerable to finality (51%) attacks as the current Ethereum chain, which means that if a certain shard is attacked, the attacker can prevent the shard from finalising or roll back certain blocks, essentially stealing money. In the case of an attack on data availability, the worst that can happen is that rollups won’t be able to continue the chain and will likely need to roll back unconfirmed batches through their governance.
However, attacks on an execution shard and on data availability look completely different.
Technically, it’s much easier to store data than to run a parallel execution layer. Instead of keeping up with the chain, possibly monitoring PBS, leader election, and the whole consensus burden, you just download some MBs of data and share them with people when needed. Because of that, there would be more people storing all existing blobs from all committees than running all parallel chains. More RPCs, explorers and infrastructure solutions could afford to store everything for their products.
Moreover, unlike an execution shard, data has a 1/N trust assumption. As long as any single entity stores the required blob, it’s available, because you can prove that a certain blob is valid with its commitment. Therefore, it becomes really hard to perform a censoring attack on blobs, because you have to somehow keep the target blobs away from all archive nodes. Just imagine - by running a single archive node, you make any such attack on blobs in the network impossible on your own.
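To get a feeling for how much weaker a 1/N assumption is than an honest-majority one, here is a small probability sketch. The model is an assumption made purely for illustration and is not part of the article: each of N participants is independently honest with probability p. The function names are hypothetical.

```python
from math import comb


def p_any_honest(n: int, p: float) -> float:
    """1/N assumption: the system works if at least one participant is honest."""
    return 1 - (1 - p) ** n


def p_honest_majority(n: int, p: float) -> float:
    """Majority assumption: the system works if more than half are honest."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


# Even if only 10% of participants are honest, data held by many parties
# almost surely stays available, while a majority-based system fails.
for n in (10, 50, 100):
    print(f"N={n:>3}: 1/N holds with p={p_any_honest(n, 0.1):.4f}, "
          f"majority holds with p={p_honest_majority(n, 0.1):.2e}")
```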
With execution sharding it’s obvious. All shards are interoperable by design, because they coordinate through single beacon chain and belong to the same consensus mechanism. What about rollups?
It’s completely possible to implement a rollup interoperability protocol. However, such a protocol requires fast finality, because we need to receive messages from L1 in a matter of blocks and be sure that a message from one side won’t be reversed after its execution on the other side.
Modern proof systems already allow us to generate ZK proofs in seconds on consumer-grade PCs, but we have to remember that currently all rollups are in early stages and have “training wheels” - certain restrictions needed for fast and easy recovery of the chain in case of a bug or an exploit. One of these popular training wheels is proof delay. For example, Scroll delays batch finality by 30 minutes and zkSync delays it by ~24 hours. It’s obvious that rollups can’t effectively interact with each other with such delays.
So, in order to make rollups interoperable, we should wait until they leave their testing phase, which will probably take a long time. Why?
In execution sharding, all shards are considered part of the Ethereum consensus. It means that if some of them contain a bug, or are attacked, it’s considered a fault of Ethereum consensus and is therefore subject to a community fork.
It means that the safety of users is guaranteed by the community, which will fork the chain in case of a consensus fault. For example, if there were a bug that allowed a malicious actor to steal ETH from people’s balances, the community would make a fork update to fix this bug.
Rollups, in turn, are made by independent developers and do not count as part of the Ethereum protocol. This is not as big a problem as it would be with a sidechain that you can’t fork, because in rollups you at least don’t have an (attackable) consensus, but bugs still exist, and due to the general complexity of modern ZK rollups, bugs are not a rare thing.
Currently rollups deal with potential bugs by running centralised sequencers (provers), having special upgradeable contracts, creating security councils that can stop the rollup in case of emergency, and other restrictions that help users not worry about the security of their assets, but seriously harm the trustlessness and decentralisation of the protocols. These training wheels can’t last long, which is why rollup teams spend more money and time on bug bounties and audits than anyone else in the ecosystem, but of course even that can’t make them sure about the security of protocols that are expected to handle trillions of dollars and the whole world financial system. If they’re hacked, there’s nothing the Ethereum community and consensus can do to help them.
One of the solutions, suggested by Alex Gluchowski (zkSync), is a system of on-chain Courts in Ethereum. Basically the idea is to make some sort of community court that can fork L1 in case of a bug in some big protocol (DeFi or L2) important to the Ethereum ecosystem. This is definitely a controversial idea, because it questions the whole “code is law” philosophy (which was already violated once) and in fact brings Ethereum back to the web2 world of human vices - corruption, corporate or government propaganda, etc.
Another, more technical solution is to enshrine a standardised zkEVM inside the Ethereum protocol in the form of a precompile. The advantage of this idea is that we still have the fork guarantees of L1, but they aren’t tied to the problems of a particular rollup or dApp. As long as the bridge, together with proposing and sequencing, works normally in the rollup, everything else is secured by L1 consensus.
However, its serious problem is that it doesn’t support rollups that differ from the standardised zkEVM, whether for optimisation reasons or because they work on a different VM. While there are ZK rollups that follow the philosophy of “as close to Ethereum as possible”, such as Scroll or Taiko, rollups such as zkSync Era have their own version of an EVM-like virtual machine with many features and peculiarities that don’t exist in the canonical EVM, such as account abstraction and system contracts.
We can summarise this comparison with the table below:
While being native to Ethereum tech, execution sharding can’t scale to the levels of rollups and isn’t nearly as secure as rollups with danksharding. Danksharding, being a data sharding mechanism, is probably the most secure way to distribute data, since it doesn’t rely on an honest majority: blobs can be stored by anyone, and a single person can prevent any censoring attack on blobs this way.
In crypto, we don’t like “it works if nothing breaks”, but we have to admit that in comparison with execution sharding, danksharding is pretty much impossible to break in normal circumstances. And even if it’s broken, rollups don’t risk stolen funds, because the consequences of an attack are easy to recover from, especially with a well-built governance system. We still have a monolithic system secured by 60 billion dollars under the hood!
Rollups let developers build chains that are fully tailored to their needs and easily configurable. They can create anything, from custom sequencing rules or account logic to a custom VM or a Linux runtime inside of a rollup. Rollups open up actual blockchain parallelisation by allowing each entity, even if it’s your home PC, to contribute to the execution of transactions in the network.
However, this all comes at a cost: rollups can’t have the security guarantees of Ethereum consensus, so their development is really hard, expensive and takes a long time. The work is ongoing, and I think that in a year or two we’ll already start to see fully interoperable rollups with shared liquidity and smart wallets that abstract away the whole “rollups” term on the UX side, so we can enjoy a basically monolithic experience, but with distributed efficiency and bulletproof decentralisation.
Thank you for reading.