Danksharding and the Modular Narrative - A Deep Dive into Data Availability

Danksharding is the future of Ethereum scalability; this article discusses Danksharding and the future of modular blockchains.

I have to say, the name Danksharding is cool.

In the process of scaling, there are two problems:

  1. Scalable verification of computation:

    Efficiently verify the results of computation instead of having full nodes re-execute everything. In Ethereum's Rollup-centric Roadmap, rollups are responsible for using fraud proofs or ZK proofs to deliver high-throughput, secure transaction processing.

    Of course, there is no denying that such a system may also be used at the L1 level to add "native" high-throughput execution in the future, just like what PSE and Scroll are doing now.

  2. Scalable verification of data availability:

    Efficiently verify data availability instead of having full nodes download all the data. While ensuring consensus, L1 also needs to provide a "data availability engine" that guarantees rollup data is available - usually the Batch + State data.

    If L1 cannot guarantee data availability:

    1. OPR: challenges cannot be submitted, because the State + Batch data needed for fraud proofs is unavailable
    2. ZKR: the current state (State) cannot be reconstructed

Transaction Progress

Scalable verification of data availability is usually more complex: fraud proofs cannot be used directly, because when a challenge happens it is hard to judge who is at fault - the fisherman's dilemma.

Fortunately, techniques such as DAS (data availability sampling), polynomial commitments, and erasure coding can help us verify data availability at scale.

Against this background, the following solutions emerged:

Native: Danksharding

Others: Validium / zkPorter / Celestia / Polygon Avail

This article will mainly analyze Danksharding.

Danksharding - all for the future

1. Traditional Sharding and Danksharding

The traditional sharding design divides validators into different committees; each committee verifies a different part of the data on an independent subnet.

Danksharding innovates on this. Unlike the previous design with a fixed number of shards, each with its own proposer, Danksharding adopts the PBS mechanism (proposer/builder separation): only one proposer per slot chooses the transactions and data that enter the slot, and hands block construction over to a block builder.

The main new transaction type of Danksharding is the "blob-carrying transaction": a transaction that carries additional blob data (~125 KB per blob). Compared with calldata, blobs are cheaper, and the EVM cannot access blob contents during execution (only the blob's commitment is accessible).

Since the execution block and the shard data are built and confirmed together, there is no separate shard-block confirmation delay and no need to track shard blob confirmations - the data is immediately visible at L1.

Traditional Sharding VS Danksharding

Old:

  1. Data shards
  2. DAS (erasure coding + KZG commitments)

New:

  1. PBS + crList
  2. 2D KZG scheme

2. Proto-Danksharding

Proto-danksharding (a.k.a. EIP-4844) is a proposal to implement most of the logic and “scaffolding” (e.g. transaction formats, verification rules) that make up a full Danksharding spec, but not yet actually implementing any sharding. In a proto-danksharding implementation, all validators and users still have to directly validate the availability of the full data. - Vitalik Buterin

EIP-4844 is planned to be implemented in the Shanghai hard fork.

Proto-Danksharding adds a new transaction type - "Blob Transaction"

Blob Transaction

Each blob provides 128 KiB of data space:

  1. A transaction contains up to 2 blobs, i.e. 256 KiB
  2. A block contains up to 16, i.e. 2 MiB; the target is 8, i.e. 1 MiB
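
As a quick sanity check on these numbers, here is a minimal sketch; the constant names follow the EIP-4844 draft and should be treated as illustrative:

```python
# Quick sanity check of the blob numbers above.
# Constant names follow the EIP-4844 draft; treat them as illustrative.
FIELD_ELEMENTS_PER_BLOB = 4096                 # each field element is 32 bytes
BLOB_SIZE = FIELD_ELEMENTS_PER_BLOB * 32       # 131072 bytes = 128 KiB

MAX_BLOBS_PER_TX = 2
MAX_BLOBS_PER_BLOCK = 16
TARGET_BLOBS_PER_BLOCK = 8

print(BLOB_SIZE // 1024, "KiB per blob")                                   # 128
print(MAX_BLOBS_PER_TX * BLOB_SIZE // 1024, "KiB max per tx")              # 256
print(MAX_BLOBS_PER_BLOCK * BLOB_SIZE // 2**20, "MiB max per block")       # 2
print(TARGET_BLOBS_PER_BLOCK * BLOB_SIZE // 2**20, "MiB target per block") # 1
```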

Under Proto-Danksharding, about 2.5 TB of historical data will be added every year, and about 40 TB per year after full sharding. This makes the EIP-4444 proposal essential: after a node synchronizes a blob transaction, the blob itself is deleted after a while and only the blob_versioned_hash is retained.
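
A rough back-of-the-envelope check of those growth figures, assuming 12-second slots and that the target blob space is filled on average (a simplifying assumption):

```python
# Rough estimate of annual blob data growth, assuming 12-second slots and
# that the target blob space is filled on average (a simplifying assumption).
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SLOTS_PER_YEAR = SECONDS_PER_YEAR / 12         # ~2.63 million slots

proto_target_mib = 1                           # Proto-Danksharding target per block
full_target_mib = 16                           # full Danksharding target per block

proto_tib_per_year = proto_target_mib * SLOTS_PER_YEAR / 2**20   # MiB -> TiB
full_tib_per_year = full_target_mib * SLOTS_PER_YEAR / 2**20

print(round(proto_tib_per_year, 1), "TiB/year under Proto-Danksharding")  # ~2.5
print(round(full_tib_per_year, 1), "TiB/year under full Danksharding")    # ~40
```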

Note: Data availability does not equal data retrievability. To maintain blockchain consensus, we only need to ensure data availability, and data retrievability is unnecessary. For historical data storage issues, see A step-by-step roadmap for scaling rollups with calldata expansion and sharding.

Blob data no longer occupies storage permanently, and such "cache space" is usually cheaper. Based on this, a multi-dimensional EIP-1559 fee market is introduced: transaction fees are determined by both execution gas and blob usage, each with its own independent pricing.

multi-dimension gas market (EIP-1559)
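
Below is a minimal sketch of how the blob fee can be updated independently of execution gas, loosely following the exponential-update mechanism in the EIP-4844 draft; the constant values here are illustrative, not the actual spec parameters:

```python
# Minimal sketch of an independent, EIP-1559-style blob fee market, loosely
# following the exponential-update mechanism in the EIP-4844 draft.
# Constant values are illustrative, not the actual spec parameters.
MIN_BLOB_GASPRICE = 1
BLOB_GASPRICE_UPDATE_FRACTION = 3_000_000   # controls how fast the fee reacts

def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator

def blob_base_fee(excess_blob_gas: int) -> int:
    # excess_blob_gas grows while blocks use more than the blob target and
    # shrinks while they use less, so the blob fee rises and falls
    # exponentially, independently of the regular execution base fee.
    return fake_exponential(MIN_BLOB_GASPRICE, excess_blob_gas,
                            BLOB_GASPRICE_UPDATE_FRACTION)
```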

Proto-danksharding already does most of the work needed for full Danksharding. All remaining work consists of consensus-layer changes that don't require any additional work by client teams, users, or rollup developers.

This is why I wrote "all for the future" in the subtitle. The underlying architecture does not require significant changes regardless of the subsequent ZK verification or PBS design upgrade.

Proto-danksharding lays an open foundation for the future of Ethereum.

3. Full Danksharding

Full Danksharding will implement PBS + crList + DAS.

First, the blob space per block increases: Max 2 MiB, Target 1 MiB → Max 32 MiB, Target 16 MiB.

Node bandwidth becomes challenging to support as the data grows larger, so PBS + crList is used to build blocks and DAS to verify data availability.

Let's go through these step by step.

PBS + crList - Reduce proposer bandwidth

In Danksharding, the bandwidth requirements for building blocks are high: first, you need to download up to 32 MiB of blob transactions, and second, you need to broadcast the block for validators to sample.
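
To get a feel for the numbers, here is a rough, illustrative estimate assuming 12-second slots and the 32 MiB per-block maximum above (before accounting for the execution payload and re-broadcasting):

```python
# Rough, illustrative estimate of the raw blob bandwidth a builder needs,
# assuming 12-second slots and the 32 MiB per-block maximum above.
SLOT_TIME_S = 12
MAX_BLOB_DATA_MIB = 32

mib_per_second = MAX_BLOB_DATA_MIB / SLOT_TIME_S        # ~2.7 MiB/s
mbit_per_second = mib_per_second * 8 * 1.048576         # MiB/s -> Mbit/s

print(round(mib_per_second, 1), "MiB/s just to download blob data")     # ~2.7
print(round(mbit_per_second), "Mbit/s, before broadcasting the block")  # ~22
```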

The core idea of PBS is to separate Builder and Proposer:

Proposers = validators

  1. Select the winning transaction list, then create and broadcast the block header.
  2. Honest-majority assumption. Small amount of data and low bandwidth requirements; ideally, the validator can be completely stateless.

Builders = a separate role

  1. Create and broadcast the list of transactions, i.e. the block body.
  2. Honest-minority assumption. Large amount of data and high bandwidth requirements; more centralized.

The resource-intensive work is left to the builders, which also makes MEV more democratic. For decentralization, a hybrid PBS design is used (a better design may emerge in the future). The steps for producing a block are as follows:

hybrid PBS design
  1. The Proposer publishes a crList (censorship-resistance list) to builders. The list consists of all transactions in the Proposer's mempool.
  2. Each Builder selects transactions from the crList, orders them and extracts MEV until the block gas limit is filled, creating an exec block body.
  3. Builders publish the hash of their exec block body together with a bid to the Proposer; the Proposer accepts the highest bid, builds the corresponding block header, and broadcasts it. The Proposer (and everyone else) does not learn the contents of any exec block body until after the winning header has been selected.
  4. The network synchronizes the block header from the Proposer and the block body from the winning Builder (a toy sketch of this flow follows below).
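
To make the flow concrete, here is a toy sketch of the auction; all names and data structures are illustrative, and the real protocol's commitments, signatures, and timing rules are omitted:

```python
# Toy sketch of the hybrid PBS flow above. Names and data structures are
# illustrative; the real protocol's commitments, signatures and timing
# rules are omitted.
import hashlib
from dataclasses import dataclass

@dataclass
class Bid:
    builder: str
    body_hash: str   # hash of the exec block body; the body itself stays private
    value: int       # payment offered to the proposer

def publish_crlist(proposer_mempool: list[str]) -> list[str]:
    # Step 1: the proposer publishes the censorship-resistance list.
    return list(proposer_mempool)

def build_body(crlist: list[str], mev_txs: list[str]) -> list[str]:
    # Step 2: a builder orders crList transactions and adds MEV extraction
    # up to the block gas limit (ordering and gas accounting omitted).
    return crlist + mev_txs

def submit_bid(builder: str, body: list[str], value: int) -> Bid:
    # Step 3: builders reveal only a hash of their body plus a bid.
    digest = hashlib.sha256("".join(body).encode()).hexdigest()
    return Bid(builder, digest, value)

def select_winner(bids: list[Bid]) -> Bid:
    # The proposer picks the highest bid without seeing any body, then
    # builds and broadcasts the block header around it (step 4: the network
    # later fetches the winning body from that builder).
    return max(bids, key=lambda b: b.value)
```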

DAS - Reduce validator bandwidth

DAS has been discussed for years. Ensuring data availability without every node downloading all the data is key to blockchain scaling.

This involves erasure coding and KZG commitments, which are common techniques in DA solutions.

Erasure coding

Specific principle:

The original data is extended to 4x its size through 2D erasure coding, and any 75% of the extended data is enough to reconstruct 100% of it. This turns a 100% data availability problem into a 75% problem.

If a block producer wants to hide any piece of data at all, it must withhold at least 25% of the extended data. Each node therefore only needs to take a fixed number of random samples to be confident, with high probability, that 75% of the data is available, at which point the entire block can be reconstructed.
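
A minimal sketch of the sampling math, under the simplifying assumption that each sample is an independent, uniformly random chunk of the extended data:

```python
# Why a fixed number of random samples is enough: in the worst case allowed
# by 2D erasure coding, a malicious producer withholds just over 25% of the
# extended data, so each uniform random sample is unavailable with
# probability ~0.25.
WITHHELD_FRACTION = 0.25

def fooled_probability(num_samples: int) -> float:
    """Probability that every sample lands on available data, i.e. the
    withholding goes completely undetected by this one sampling node."""
    return (1 - WITHHELD_FRACTION) ** num_samples

for n in (10, 20, 30):
    print(n, "samples ->", f"{fooled_probability(n):.4%}", "chance of being fooled")
# 10 -> ~5.63%, 20 -> ~0.32%, 30 -> ~0.018%
```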

With erasure coding, validators can verify data availability much more efficiently. However, it is also necessary to ensure that block producers encode the data correctly.

KZG Commitment

A KZG commitment plays a role similar to a Merkle root. The difference is that a KZG commitment guarantees all the committed points lie on the same polynomial, and the erasure-coded extension points can be checked against the same commitment - which is exactly what we need, and is also how correct encoding can be guaranteed without fraud proofs.

Note: a KZG commitment is 48 bytes, while the EVM works more naturally with 32-byte values, so the KZG commitment is converted to a versioned hash:

convert KZG Commitment to versioned hash
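
A minimal sketch of the conversion, following the scheme in the EIP-4844 draft: a one-byte version prefix plus the last 31 bytes of the commitment's SHA-256, giving a 32-byte EVM-friendly value:

```python
# Minimal sketch of the commitment -> versioned hash conversion, following
# the EIP-4844 draft: a 1-byte version prefix plus the last 31 bytes of the
# SHA-256 of the 48-byte KZG commitment, giving a 32-byte EVM-friendly value.
from hashlib import sha256

BLOB_COMMITMENT_VERSION_KZG = b"\x01"

def kzg_to_versioned_hash(kzg_commitment: bytes) -> bytes:
    assert len(kzg_commitment) == 48
    return BLOB_COMMITMENT_VERSION_KZG + sha256(kzg_commitment).digest()[1:]
```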

All for the future: in the long term, if we switch from KZG to something else (e.g. Merkle trees + STARKs) for quantum-safety reasons, rollups will not require any EVM-level changes; sequencers just switch to the new transaction type, thanks to the version byte.

At the same time, a new opcode, GET_VERSIONED_HASH_OPCODE, is added. It takes an index as a stack argument: if index < len(tx.header.blob_versioned_hashes), it returns tx.header.blob_versioned_hashes[index]; otherwise it returns 0.
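
In spec-style pseudocode (names as given above; the exact opcode name and encoding may change in later versions of the proposal):

```python
# Spec-style sketch of the opcode semantics described above; the exact
# opcode name and numbering may differ in later versions of the proposal.
def get_versioned_hash(tx, index: int) -> bytes:
    if index < len(tx.header.blob_versioned_hashes):
        return tx.header.blob_versioned_hashes[index]
    return b"\x00" * 32   # out-of-range indices return zero
```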

Generating KZG proofs is very time-consuming: it takes roughly 100 seconds to generate the KZG proofs for 32 MB of data on a single core, so finishing within 1 second would require about 100 CPU cores. The Ethereum team is currently investigating optimized CPU implementations.

Research on accelerating validation is also underway: Optimizing EIP-4844 transaction validation (using KZG proofs) #5088

Now the execution client can only access the blob versioned hash but not the blob data at execution time, which is an exciting innovation.

Polygon Avail also uses KZG commitments, while Celestia uses Merkle trees + fraud-proof verification.

There is a trade-off between cost and latency here, but I'm personally optimistic about the mass adoption of validity proofs for DA in the long term. Celestia may also upgrade its scheme in the future.

💡 The DAS-based design is still being optimized, and you can submit your ideas here: requests-for-proposals

Learn more about DAS and erasure coding:

Lazy Validator Problem

This is an open question. If a node simply signs off on all shard blobs without actually downloading the data, it still receives rewards while saving storage and bandwidth costs, so some penalty mechanism is needed. How to balance decentralization against sampling requirements still needs further work.

4. Rollups and Danksharding

Danksharding adds two precompiles: the blob verification precompile and the point evaluation precompile.

blob verification precompile
Point evaluation precompile

For OPR it is simple: at fraud-proof time, the blob verification precompile checks the blob contents against the previously submitted versioned hash.

It is trickier for ZKR, because the submission transaction needs to provide a proof that operates directly over the data, and using ZK-SNARKs to prove that the shard data matches the commitment on the beacon chain would be very expensive.

A more ingenious solution is a proof-of-equivalence protocol. The principle is straightforward: use the point evaluation precompile to prove that the KZG commitment and the ZK rollup's own commitment point to the same data.
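
A minimal sketch of the idea follows; names like blob_polynomial and rollup_commitment are illustrative, and the precompile input layout follows the EIP-4844 draft and should be treated as indicative rather than final:

```python
# Minimal sketch of the proof-of-equivalence idea. Names such as
# blob_polynomial and rollup_commitment are illustrative, and the
# precompile input layout should be treated as indicative, not final.
from hashlib import sha256

def proof_of_equivalence(kzg_commitment: bytes,
                         rollup_commitment: bytes,
                         blob_polynomial,
                         field_modulus: int):
    # 1. Derive a random evaluation point z from BOTH commitments
    #    (Fiat-Shamir), so neither side can pick it adaptively.
    z = int.from_bytes(sha256(kzg_commitment + rollup_commitment).digest(),
                       "big") % field_modulus
    # 2. Evaluate the blob polynomial at z.
    y = blob_polynomial.evaluate(z)
    # 3. On L1, the point evaluation precompile checks that the KZG
    #    commitment really opens to y at z, roughly:
    #       input = versioned_hash | z | y | commitment | kzg_proof
    # 4. Inside the rollup's own ZK circuit, prove that the data behind
    #    rollup_commitment also evaluates to y at z.
    # If both checks pass, the two commitments bind the same data.
    return z, y
```

Only one cheap point evaluation happens on-chain; the heavy lifting stays inside the rollup's existing proof system.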

This is what Danksharding is all about: not just an optimization of sharding but a real innovation, opening up countless "adjacent possibilities".

5. The future of modular blockchains

The design of rollups and sharding is gradually making the monolithic blockchain obsolete, and modular architecture is becoming mainstream: we solve blockchain scaling by modularizing the consensus layer, the DA layer, and the execution layer.

I like this article from Polynya: Processors & blockchains: modular is revolutionary.

The modular stack opens up the imagination of blockchain design. There are also many new solutions for DA.

For example, Celestia works like an open modular component: any chain can use it to ensure DA. Celestia has its own nodes and consensus, but it does not process transactions; it only guarantees data availability through data availability sampling. (An article on Celestia will be published soon.)

A rollup can use Ethereum as the settlement layer and Celestia as the DA layer, publishing transaction data to Celestia instead of Ethereum. Celestia then relays the verification result through a DA bridge contract on L1.

Many new components can also be created based on Celestia, such as sovereign rollups and settlement rollups. Celestia aims to become a plug-and-play open component (not only in the Ethereum ecosystem).

Celestia Modular World

Many other solutions have also appeared on the L2 side, such as Validium, zkPorter, Polygon Avail, and the permissionless DAC that StarkWare is researching. How to safely coordinate these solutions within one system is still being explored.

I think Ethereum will be the best choice for settlement and DA in the long term. It provides a highly decentralized base layer on which any project can be built, including systems with centralized block production. With the development of volitions, different data solutions will emerge, and the choice of DA solution will be handed to users, who can pick the data solution they need - modularization brings infinite possibilities.

The monolithic blockchain narrative is fading, and Modular Blockchain will be the future.

This is a record of my learning; DMs are welcome 🍻

If you want to see the current calldata size situation, my friend made a dashboard on Dune:
