Monad: Highly Performant, Parallelized, EVM Compatible L1

Cookies Research

0x7b4e

October 4th, 2023

Summary thread available here.

Key Takeaways

High Performance | Solid stats

10,000 TPS. 1 second block time. 1 second finality.

EVM Compatible | Building is a piece of cake

Full EVM Bytecode Compatibility: Applications built for Ethereum can be deployed on Monad with no code changes

Full Ethereum RPC Compatibility: User-facing infrastructure (e.g. MetaMask) can be used seamlessly

Unique Architecture | The core to high performance

Monad employs 2 fundamental architectural techniques to achieve high performance: (1) Parallel Execution (2) Superscalar Pipelining

1. Introduction

The blockchain landscape is constantly evolving, and numerous contestants have attempted to improve upon Ethereum’s architecture. Teams build more sophisticated technology to attract more builders who create better protocols, with the eventual goal of gaining more users. Despite all the competition, Ethereum has remained the dominant blockchain, with the greatest total value locked (TVL) residing in its ecosystem.

DefiLlama: Total Value Locked (TVL) on All Chains

Due to the scalability issues faced by Ethereum (e.g. network congestion, high gas fees), many teams have come forward to build their own solutions, contributing to the competitive blockchain landscape. Some of the more prominent contestants to Ethereum include:

Ethereum virtual machine (EVM) layer 1s (L1s) including Tron, BSC and Avalanche, as well as non-EVM L1s including Solana, Aptos, Sui and many more
Layer 2s (L2s), with Arbitrum and Optimism currently leading the race, alongside more optimistic L2s (e.g. Base, Mantle) and zero-knowledge (ZK) L2s (e.g. zkSync, Starknet) being built

Blockchains tend to strive for EVM compatibility, which allows applications built on Ethereum to be ported over easily while maintaining a familiar user experience. Teams view this as important because it serves as a gateway to the mammoth TVL available on Ethereum.

To have a better understanding of the changing blockchain landscape, you can refer to this really well written article by Xpara from Four Pillars Research.

Monad joins the playing field, building an L1 aimed at optimizing the efficiency of blockchains beyond what EVM chains can currently achieve. This article dives into Monad’s architecture, with technical concepts explained with graphics as far as possible, and takes a detailed look at the benefits it brings.

2. Monad’s Architecture

The key to Monad’s high performance lies in its parallel execution, to which superscalar pipelining is a complementary concept.

Pipelining: Breaking down an instruction set into successive steps. In turn, these steps can be executed concurrently (in parallel).

Let’s understand pipelining through the example below, where the task of laundry has been split into 4 successive steps: (1) Wash (2) Dry (3) Fold (4) Store. As a result, the steps can now be carried out in parallel. When the first basket of clothes is placed into the dryer after washing, the washer becomes available for basket 2.
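The laundry example can also be checked with a little arithmetic. The sketch below assumes every step takes exactly one time unit and each machine handles one basket at a time. The function names are made up for illustration and are not taken from Monad.

```python
STAGES = ["Wash", "Dry", "Fold", "Store"]

def sequential_time(baskets: int, stages: int = len(STAGES)) -> int:
    """Each basket finishes every stage before the next basket starts."""
    return baskets * stages

def pipelined_time(baskets: int, stages: int = len(STAGES)) -> int:
    """A new basket enters the washer as soon as the previous one leaves it."""
    if baskets == 0:
        return 0
    return stages + (baskets - 1)

def pipeline_schedule(baskets: int) -> list[list[str]]:
    """For each time unit, list which basket occupies which stage."""
    schedule = []
    for t in range(pipelined_time(baskets)):
        busy = []
        for stage_index, stage in enumerate(STAGES):
            basket = t - stage_index  # basket b sits in stage s at time b + s
            if 0 <= basket < baskets:
                busy.append(f"{stage}: basket {basket + 1}")
        schedule.append(busy)
    return schedule

if __name__ == "__main__":
    for t, busy in enumerate(pipeline_schedule(3)):
        print(t, busy)
```

Under these assumptions, 3 baskets take 12 time units one after another but only 6 when pipelined, because the washer never sits idle while a later basket is waiting.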

Now that we have understood parallel execution and superscalar pipelining, we can move on to learn more about Monad’s consensus and execution mechanisms.

3. Monad’s Consensus

There are 4 key areas to Monad’s consensus. Here’s a brief introduction before diving deep into each:

(a) MonadBFT: Consensus mechanism
(b) Shared Mempool: Block propagation through hashes
(c) Deferred Execution: Decoupling execution and consensus
(d) Carriage Cost and Reserve Balance: Prevent spam from deferred execution

3.1 MonadBFT

Definitions

Byzantine Nodes: Malicious nodes
Quorum Certificate (QC): ‘Yes’ votes from 2f + 1 validators (a supermajority)
Timeout Certificate (TC): Signed timeout messages from 2f + 1 validators (a supermajority)

Default Assumptions

Total Number of Nodes: 3f + 1
Maximum Number of Byzantine Nodes: f
Non-Byzantine Nodes Required: 2f + 1
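
To make the numbers above concrete, here is a small sketch that derives f and the 2f + 1 threshold from the total number of nodes. The helper names are illustrative only.

```python
def max_byzantine(total_nodes: int) -> int:
    """Largest f such that total_nodes >= 3f + 1."""
    if total_nodes < 1:
        raise ValueError("need at least one node")
    return (total_nodes - 1) // 3

def quorum_size(total_nodes: int) -> int:
    """Votes needed for a QC (or timeout messages for a TC): 2f + 1."""
    return 2 * max_byzantine(total_nodes) + 1

if __name__ == "__main__":
    for n in (4, 7, 100):
        print(n, max_byzantine(n), quorum_size(n))
```

With 3f + 1 nodes, any two groups of 2f + 1 validators overlap in at least f + 1 nodes, so at least one node in the overlap is non-Byzantine. This is why 2f + 1 is the threshold for both certificates.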

Flow

Let’s understand the MonadBFT mechanism with a specific example.

(1) round k

Leader Alice will send out a new block (k) and a QC from the previous round k - 1
Validator nodes will review the block (k) for adherence to protocol rules → Vote
Signed votes will be sent to the leader in the next round k + 1

(2) round k + 1

Leader Bob will propose block (k + 1) and the QC from round k
Validator nodes will review the block (k + 1) for adherence to protocol rules
Validator nodes see the QC for round k in block (k + 1)
- But Alice’s block (k) might not be finalized yet
- This is because the leader Bob might be malicious and might have sent block (k + 1) to fewer than 2f + 1 validators
Validator votes → Signed votes will be sent to the leader in the next round k + 2

(3) round k + 2

Leader Charlie will propose block (k + 2) and the QC from round k + 1
At this point, upon QC from round k + 1, validators can affirm that block (k) has been finalized

In summary, a block takes 2 rounds to be finalized. The first is a consensus on the transactions within the block. The second is a validation of the consensus. The graphic below depicts the flow:
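
The same flow can be traced in code. This is a minimal sketch of the happy path only: it assumes every round gathers 2f + 1 votes and no timeouts occur. The `Proposal` type and `run_happy_path` function are hypothetical names, not part of MonadBFT.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Proposal:
    round_number: int
    leader: str
    qc_for_round: Optional[int]  # QC carried over from the previous round, if any

def run_happy_path(leaders: list[str], first_round: int = 1) -> dict[int, int]:
    """Return {block_round: round_in_which_that_block_is_finalized}."""
    finalized_in: dict[int, int] = {}
    previous_round_had_quorum = False
    for offset, leader in enumerate(leaders):
        rnd = first_round + offset
        qc = rnd - 1 if previous_round_had_quorum else None
        proposal = Proposal(rnd, leader, qc)
        # A QC for round r - 1 shows that block r - 1 (which itself carried the
        # QC for block r - 2) reached 2f + 1 validators, so block r - 2 is final.
        if proposal.qc_for_round is not None:
            finalized_block = proposal.qc_for_round - 1
            if finalized_block >= first_round:
                finalized_in[finalized_block] = rnd
        previous_round_had_quorum = True  # assumption: every round gets 2f + 1 votes
    return finalized_in

if __name__ == "__main__":
    print(run_happy_path(["Alice", "Bob", "Charlie"], first_round=10))
```

With Alice, Bob and Charlie leading rounds 10, 11 and 12, Alice’s block from round 10 is finalized in round 12, matching the 2-round delay described above.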

3.2 Shared Mempool

At this stage, block propagation on Ethereum is still a significant bottleneck. When block leaders propose large blocks, propagating these blocks to the validator network for consensus requires extremely high bandwidth from validator nodes, resulting in slower block propagation.

With MonadBFT, instead of propagating the entire block to the validator network for consensus, the leader propagates the block using its hash. You might be wondering: if only a hash is available, how will nodes know the details of the block they are voting on? Let’s take a more detailed look.

User transactions that are submitted (but not yet executed) go to an RPC node. The pending transactions are then forwarded to validators’ local mempools. Each individual validator might hold only some of the transactions that are included in the block. For validators to know exactly which block they are voting on, they need to know all transactions included in the block. This can be achieved through:

(a) Erasure-coding Transactions: Individual validators can share their respective pieces of transactions with each other

(b) Broadcast Tree: This is an efficient communication method for validators to share the various pieces of transactions

With the above 2 mechanisms, it becomes possible for validators participating in consensus to reconstruct the entire block referred to by the hash given by the block leader.
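
To give a feel for reconstruction from hashes, here is a minimal sketch. It assumes that the proposal references each transaction by its hash and that a validator looks the transactions up in its local mempool. Erasure coding and the broadcast tree are not modeled. SHA-256 is only a stand-in hash function, and all names are hypothetical.

```python
import hashlib

def tx_hash(raw_tx: bytes) -> bytes:
    # SHA-256 is a stand-in; it is not claimed to be the hash Monad uses.
    return hashlib.sha256(raw_tx).digest()

def reconstruct_block(proposed_hashes, local_mempool):
    """Rebuild the ordered transaction list from hashes.

    Returns (transactions, missing_hashes). Slots for transactions this
    validator has not seen yet are left as None.
    """
    transactions, missing = [], []
    for h in proposed_hashes:
        raw = local_mempool.get(h)
        if raw is None:
            missing.append(h)
        transactions.append(raw)
    return transactions, missing

# Hypothetical pending transactions
tx_a, tx_b, tx_c = b"alice pays bob 1", b"bob pays carol 2", b"carol pays dave 3"
proposal = [tx_hash(tx_a), tx_hash(tx_b), tx_hash(tx_c)]

# This validator has only seen two of the three transactions so far
mempool = {tx_hash(tx_a): tx_a, tx_hash(tx_c): tx_c}
txs, missing = reconstruct_block(proposal, mempool)
print(len(missing), "transaction(s) still need to be fetched from peers")
```

In this sketch the validator already holds 2 of the 3 transactions and only needs to obtain the missing one from its peers, which is the gap the sharing mechanisms above are meant to close.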

Let’s take a look at the potential improvement that can result from block propagation using hashes:

Traditional Block Propagation: If a block has 10,000 transactions, each 500 bytes in size, the total block size will be 5 MB. The block can be even bigger if the transactions are more complex and therefore larger
Monad’s Block Propagation: Hashes are only 32 bytes in size. Instead of having to propagate all 5 MB of data, straining the various stakeholders (validator nodes, block leader, network), block propagation becomes orders of magnitude lighter and more efficient
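
The comparison can be worked through with the figures given above. The article does not say whether a proposal carries a single hash for the whole block or one hash per transaction, so the sketch below computes both cases as assumptions.

```python
TX_COUNT = 10_000
TX_SIZE_BYTES = 500
HASH_SIZE_BYTES = 32

full_block_bytes = TX_COUNT * TX_SIZE_BYTES        # 5,000,000 bytes = 5 MB
single_hash_bytes = HASH_SIZE_BYTES                # assumption: one hash per block
per_tx_hash_bytes = TX_COUNT * HASH_SIZE_BYTES     # assumption: one hash per transaction

ratio_single_hash = full_block_bytes / single_hash_bytes
ratio_per_tx_hash = full_block_bytes / per_tx_hash_bytes

print(f"full block:       {full_block_bytes:>9,} bytes")
print(f"single hash:      {single_hash_bytes:>9,} bytes ({ratio_single_hash:,.0f}x smaller)")
print(f"hash per tx:      {per_tx_hash_bytes:>9,} bytes ({ratio_per_tx_hash:.3f}x smaller)")
```

A single 32-byte hash is about 156,000 times smaller than the 5 MB block. Even with one hash per transaction, the proposal would shrink to 320 KB, roughly 15 times smaller.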

P.S.: Erasure-coding has not been implemented yet as of this stage of development.

🍪’s thoughts: Despite all the benefits mentioned, particularly with regards to efficiency, it remains to be seen whether the architecture might result in possible attack vectors (perhaps through broadcast tree) or inaccuracies in consensus.

3.3 Deferred Execution

Before we begin looking into deferred execution, here are some definitions:

Consensus: Nodes agree on official ordering of transactions
Execution: Transaction executed and state is updated
Gas Usage: Different types of transactions (differing levels of complexity) use different amounts of gas
Gas Limit: Total amount of gas that can be consumed for each block
Block Time: Will be set based on the gas limit
- Time Budget: Gas limit allocation

Currently on Ethereum, execution has to be completed before consensus.

The gas limit on Ethereum has to account for:

(1) Execution by block proposer
(2) Execution by validator nodes
(3) Consensus of validator nodes

As a result, the time budget for execution becomes extremely limited, given that there needs to be sufficient time for both executions and consensus. This puts a cap on the number of transactions that can be executed in one block, limiting the scalability of Ethereum when network volume is high or when there is a large number of complex transactions.

With Monad, execution takes place after consensus, as explained in the above section 3.1 MonadBFT. This means that during consensus, nodes agree on the official ordering of transactions, but neither the leader nor the validator nodes are required to execute the transactions at this point. It also means that when the leader proposes an ordering and validator nodes vote on it, neither party yet knows the result of the transactions (whether they will execute successfully, e.g. with sufficient gas, or be reverted).

The key concept in such an architecture is that the true state is determined once the official ordering of transactions is fixed. Monad leverages this, which is why in MonadBFT, nodes are able to come to consensus on the ordering of transactions in round k while concurrently executing transactions from round k - 1.

In this scenario, the entire gas limit of a block will be dedicated to execution, allowing Monad to increase the number of transactions that can be executed in each block, achieving its high throughput.
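
The idea that ordering alone fixes the state can be illustrated with a toy example. The sketch below assumes a simplified model in which a transaction is a plain balance transfer and a transfer the sender cannot afford is reverted. The `execute_block` function and the account names are hypothetical.

```python
def execute_block(state: dict[str, int], ordered_txs: list[tuple[str, str, int]]):
    """Apply transfers in the agreed order and return (new_state, receipts).

    A transfer the sender cannot afford stays in the block but is recorded
    as reverted: consensus fixed its position without knowing the outcome.
    """
    new_state = dict(state)
    receipts = []
    for sender, recipient, amount in ordered_txs:
        if new_state.get(sender, 0) >= amount:
            new_state[sender] -= amount
            new_state[recipient] = new_state.get(recipient, 0) + amount
            receipts.append("success")
        else:
            receipts.append("reverted")
    return new_state, receipts

genesis = {"alice": 10, "bob": 0}
block = [("alice", "bob", 7), ("alice", "bob", 7), ("bob", "alice", 5)]

# Two nodes execute the same agreed ordering independently, possibly at different times
state_node_1, receipts_1 = execute_block(genesis, block)
state_node_2, receipts_2 = execute_block(genesis, block)
print(state_node_1, receipts_1)
```

Both nodes arrive at the same balances and the same reverted transfer, even though neither knew the outcome when the ordering was agreed. Execution can therefore run after consensus without the nodes having to compare results during voting.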

However, there is a potential issue associated with deferred execution.

As mentioned, execution is carried out after consensus. This results in a situation where nodes participating in consensus do not have an up-to-date view of the state (the nodes don’t know whether the transactions can actually be executed). In such a case, there is the possibility of certain users spamming the network, attempting to get their transactions to be included in consensus, causing other transactions to be pushed out. To have a better understanding, you can refer to the bakery analogy below:

3.4 Carriage Cost and Reserve Balance

To tackle the potential of spam and denial-of-service attacks, Monad has implemented carriage cost and reserve balance.

(a) Carriage Cost

For a transaction to be included in a mempool and subsequently in a block for consensus, users will be charged a carriage cost (which can be thought of as a deposit). This serves as a cost to deter users from spamming the network.

(b) Reserve Balance

Carriage costs are deducted from the reserve balance of each account. Upon successful execution of transactions, carriage costs will be returned to users (with a delay). The reserve balance is targeted to be ~200x the carriage cost. This allows honest users to submit multiple transactions concurrently or in quick succession, which is necessary to maintain composability of transactions for DeFi applications, where transactions need to be executed one after another.
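
Here is a toy model of how the two pieces could interact. The unit carriage cost is an invented value, the 200x multiple is the target mentioned above, and the rule that a failed transaction forfeits its carriage cost is an assumption the article does not confirm. The `Account` class is hypothetical.

```python
from dataclasses import dataclass

CARRIAGE_COST = 1          # hypothetical unit cost
RESERVE_MULTIPLE = 200     # the ~200x target mentioned above

@dataclass
class Account:
    reserve_balance: int = CARRIAGE_COST * RESERVE_MULTIPLE
    in_flight: int = 0

    def try_include(self) -> bool:
        """Charge the carriage cost when a transaction is carried into a block."""
        if self.reserve_balance < CARRIAGE_COST:
            return False  # cannot cover the deposit, so the transaction is not carried
        self.reserve_balance -= CARRIAGE_COST
        self.in_flight += 1
        return True

    def settle(self, executed_ok: bool) -> None:
        """Called once the (deferred) execution of one in-flight transaction is done."""
        if self.in_flight == 0:
            raise RuntimeError("nothing to settle")
        self.in_flight -= 1
        if executed_ok:
            self.reserve_balance += CARRIAGE_COST  # deposit returned
        # Assumption: a failed transaction forfeits its carriage cost.

# A spammer tries to push 500 transactions before any of them is executed
spammer = Account()
accepted = sum(spammer.try_include() for _ in range(500))

# An honest user submits one transaction and gets the deposit back
honest = Account()
honest.try_include()
honest.settle(executed_ok=True)
print(accepted, spammer.reserve_balance, honest.reserve_balance)
```

In this model the spammer is cut off after 200 unexecuted transactions, while the honest user ends up with a full reserve again once the transaction executes.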

🍪’s thoughts: Just a random thought, but what are the chances of carriage cost being outsourced? It becomes a sort of ‘service’ where instead of each account having a reserve balance, there is a smart contract that holds a balance for all other accounts. A similar mental model could be paymasters in ERC-4337. The main idea here is to maintain capital efficiency by not requiring users to have idle assets sitting in the reserve balance.

4. Monad’s Execution

One of the key architectural differences that allows Monad to achieve high performance is its parallel execution. In this section, we will take a brief look at the mechanism and the component necessary to achieve it: MonadDb.

4.1 Parallel Execution

With parallel execution, multiple cores and threads are utilized to execute transactions in parallel, while still committing the results in the original order. This means that the end result obtained from parallel execution (e.g. account balances) will be the same as with sequential execution.

An analogy we can think of would be the drive-through at Shake Shack:

Serial Execution: There’s only 1 lane available for drive-through. Drivers 2 and 3 will have to wait for driver 1 to finish ordering and collect his order before they are able to start their order.
Parallel Execution: With parallel execution, there is now more than 1 lane available in the drive-through. All 3 drivers can order concurrently, and will still be able to get their respective orders. This is the exact same outcome as serial execution, except that customers receive their burgers faster.
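
The drive-through can be mimicked with a thread pool. This is only an illustration of running independent work in parallel and collecting results in the original order, not a model of Monad’s executor. `prepare_order` is a made-up function.

```python
from concurrent.futures import ThreadPoolExecutor

def prepare_order(driver: str) -> str:
    """Independent work that can run in any lane (thread)."""
    return f"burger for {driver}"

drivers = ["driver 1", "driver 2", "driver 3"]

# Serial execution: one lane, one driver at a time
serial_results = [prepare_order(d) for d in drivers]

# Parallel execution: three lanes
with ThreadPoolExecutor(max_workers=3) as lanes:
    # Executor.map yields results in input order, whichever lane finishes first
    parallel_results = list(lanes.map(prepare_order, drivers))

print(parallel_results == serial_results)
```

Because the orders do not depend on each other, the parallel run returns exactly what the serial run returns, in the same order.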

Specifically in the context of Monad, optimistic execution is utilized. This means that the network can start executing transactions before earlier transactions in the block have completed. The committed results of optimistic execution will match those of serial execution of the same transactions.

Nonetheless, it should be noted that optimistic execution can sometimes produce incorrect results. This is especially the case when the transactions executed in parallel touch the same accounts. The graphic below depicts how an incorrect execution might happen.

Optimistic Execution: The Problem of Incorrect Execution

Returning to the Shake Shack drive-through analogy above, imagine that all 3 cars waiting in line originally wanted a cheeseburger. However, there is only 1 cheeseburger left. The following graphic depicts what would have happened if serial execution was utilized.

Serial Execution: The Shake Shack Drive-through

Essentially, cars 2 and 3 won’t be able to order cheeseburgers, as the system has already been updated after car 1 makes and receives its order. With parallel execution, where all 3 cars order concurrently, the system will register all 3 orders. However, due to the lack of cheeseburgers, the 2 customers that didn’t receive their order will have to re-order.

Parallel Execution: The Shake Shack Drive-through

Thus, it is important for Monad to be able to capture such instances of incorrect execution and re-execute them with correct data. This is done by checking for the condition shown in the graphic below:
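
To complement the graphic, here is a minimal sketch of one way such a check could work. It assumes the condition is "a transaction read a value that an earlier transaction in the block has since changed". It is a toy model built on the cheeseburger example, not Monad’s implementation, and every name in it is hypothetical.

```python
def order_cheeseburger(car):
    """Build a transaction: take a cheeseburger if any are left."""
    def tx(read):
        stock = read("cheeseburgers")
        if stock > 0:
            return {"cheeseburgers": stock - 1, car: "cheeseburger"}
        return {car: "sold out"}
    return tx

def run_tx(tx, state):
    """Run one transaction against `state`, recording every value it reads."""
    reads = {}
    def read(key):
        reads[key] = state.get(key)
        return reads[key]
    writes = tx(read)
    return reads, writes

def execute_serial(txs, state):
    state = dict(state)
    for tx in txs:
        _, writes = run_tx(tx, state)
        state.update(writes)
    return state

def execute_optimistic(txs, state):
    snapshot = dict(state)
    # Phase 1: run every transaction against the same starting snapshot.
    # (A plain loop here; the runs are independent, so they could be parallel.)
    pending = [run_tx(tx, snapshot) for tx in txs]
    # Phase 2: commit in the original order, re-executing on stale inputs.
    committed = dict(state)
    re_executed = 0
    for tx, (reads, writes) in zip(txs, pending):
        if any(committed.get(key) != seen for key, seen in reads.items()):
            _, writes = run_tx(tx, committed)  # inputs changed, so run again
            re_executed += 1
        committed.update(writes)
    return committed, re_executed

txs = [order_cheeseburger(f"car {i}") for i in (1, 2, 3)]
start = {"cheeseburgers": 1}
final_state, re_executed = execute_optimistic(txs, start)
print(final_state, re_executed)
```

In this run, all 3 cars initially see 1 cheeseburger in stock. Car 1 commits first, cars 2 and 3 are detected as having read a stale stock and are re-executed, and the final state matches `execute_serial(txs, start)`.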

4.2 MonadDb

One critical component that allows optimistic execution to take place is MonadDb, Monad’s custom database for storing blockchain state.

In particular, optimistic execution is made possible by asynchronous I/O, a form of input/output processing that lets execution continue while I/O is still in progress. MonadDb fully utilizes the latest kernel support to achieve asynchronous I/O.

Assuming there are 2 transactions, A and B:

Execution of transaction A begins when the input is entered
Execution of transaction B can begin even without output from A
- This is feasible due to asynchronous I/O
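
The pattern can be illustrated with Python’s asyncio. This is only a stand-in for the idea of overlapping I/O and says nothing about how MonadDb itself is built. The `read_state` coroutine and its simulated 10 ms read are assumptions for the sake of the example.

```python
import asyncio

async def read_state(key: str, log: list[str]) -> int:
    """Pretend to read one piece of state from disk."""
    log.append(f"read {key} started")
    await asyncio.sleep(0.01)  # stand-in for waiting on a disk read
    log.append(f"read {key} finished")
    return len(key)  # dummy value

async def main() -> list[str]:
    log: list[str] = []
    # Both reads are issued before either one completes
    await asyncio.gather(read_state("A", log), read_state("B", log))
    return log

events = asyncio.run(main())
print(events)
```

The log shows the read for B starting before the read for A has finished, which is the behaviour described for transactions A and B above.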

5. Monad’s Benefits

5.1 Performance

The table below compares the TPS, block time and block finality of Ethereum and Monad. As shown, Monad features a much higher TPS, alongside 1 second block time and finality.

With these metrics, Monad tackles the scalability issues that have been plaguing Ethereum. This allows more users to be accommodated on the Monad network, and keeps fees as low as possible for users.

5.2 Portability

Monad features high portability due to the following architecture:

Full EVM Bytecode Compatibility
Full Ethereum RPC Compatibility

With bytecode equivalence to Ethereum (all Ethereum opcodes are supported), applications built on Ethereum can be ported to Monad without requiring any code changes. Developers will be able to port their existing work and develop with ease given the familiarity, and users will be able to access their desired applications (presuming that the protocol ports over). In addition, infrastructure such as MetaMask, Etherscan and others can be used seamlessly on Monad.
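
In practice, RPC compatibility means a client can send the same JSON-RPC request to either chain. The sketch below only builds the request body and does not contact any network. `eth_blockNumber` is a standard Ethereum JSON-RPC method, while both endpoint URLs are hypothetical placeholders.

```python
import json

def rpc_request(method: str, params: list, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body in the shape Ethereum clients expect."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )

ETHEREUM_RPC_URL = "https://ethereum-rpc.example"  # hypothetical endpoint
MONAD_RPC_URL = "https://monad-rpc.example"        # hypothetical endpoint

body = rpc_request("eth_blockNumber", [])

# The same body would be posted to either endpoint; only the URL differs
requests_to_send = {url: body for url in (ETHEREUM_RPC_URL, MONAD_RPC_URL)}
print(body)
```

A wallet or tool that already speaks this format would, under the compatibility claim above, only need its endpoint URL changed.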

6. Conclusion

Monad represents a significant leap forward in blockchain technology. The combination of high performance and EVM compatibility opens doors for developers and users alike, ensuring a seamless experience. With parallel execution, the potential of blockchain scalability paints an exciting landscape of protocols that can be built.

With that being said, there are still aspects that will likely undergo more iterations to become even more efficient. One example would be the carriage cost and reserve balance: it takes time to study transaction data before being able to determine a suitable carriage cost, and how large the reserve balance should be to sufficiently counter denial-of-service attacks.

If you would like to see more of such works, do follow me on Twitter @jinglingcookies and join my Telegram group here where I share my daily reads. For more quality research, I recommend my good friends over at Four Pillars who share research reports in various verticals: Modularity, L1s, Games and more!