OP City: Research and Optimization of OP Stack Deployments and the Cannon Fault Proofs VM

In recent years, the Ethereum ecosystem has faced significant scalability challenges, driving the need for innovative solutions that can optimize operation costs while maintaining the integrity and decentralization of the network. Among these solutions, the OP Stack and the Cannon Fault Proofs Virtual Machine (VM) have become critical components in the ongoing efforts to enhance the performance and efficiency of Ethereum Layer 2 rollups.

OP City is a research effort at Zenbit that delves into the theoretical and practical aspects of the OP Stack and the Cannon Fault Proofs VM, offering insights into their implementation, performance benchmarks, and potential future enhancements. It compiles the contributions and insights needed to complete the first OP City milestones and summarizes the documents produced from December 2023 through 2024, serving as a starting point for researchers to study and test the rollup implementation from the OP Stack and push its limits through the OP City stack.

1. OP Stack Theoretical Framework

From OPcity repository:

This section lays the theoretical groundwork for understanding the OP Stack and Fault Proofs, crucial components in the scalability solutions for Ethereum. We begin by examining the scalability issues inherent in Ethereum and how rollups, as Layer 2 solutions, are instrumental in addressing these challenges. The development of the OP Stack, along with the concept of the superchain and fault proofs, plays a vital role in optimizing operational costs for decentralized technologies. The discussion is supported by academic references and relevant URLs, providing a comprehensive overview of the theoretical underpinnings that drive these advancements.

Ethereum Scalability

Ethereum has emerged as one of the most important blockchain platforms, providing a decentralized network that has transformed how applications are developed and operated. Unlike traditional systems, Ethereum is an open-source, decentralized computing network that allows developers to create and run smart contracts, supporting a wide range of decentralized applications (dApps).

Ethereum operates as a state transition system, where transactions modify the global state of accounts and balances, powered by the Ethereum Virtual Machine (EVM), a versatile programming environment for executing complex smart contracts and dApps[1]. Smart contracts are Ethereum's standout feature. These self-executing agreements, encoded on the blockchain, enable automated and trustless interactions. They are widely used for custom tokens, financial products, Decentralized Autonomous Organizations (DAOs), file storage solutions, and more, with the blockchain's immutability and irreversibility ensuring these contracts operate with high integrity, resiliency, and transparency[2]. However, Ethereum faces the “blockchain trilemma”. This challenge involves simultaneously achieving scalability, decentralization, and security; enhancing one aspect often compromises the others. Scalability is measured in transactions per second (TPS), decentralization distributes control across the network to prevent censorship and manipulation, and security focuses on attack resistance[3].
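The state transition view described above can be sketched in a few lines of Python. This is a toy account/balance model for illustration only, not the EVM's actual state representation:

```python
from dataclasses import dataclass

# Toy state transition system: the global state maps accounts to balances,
# and each transaction moves value from a sender to a recipient.
@dataclass(frozen=True)
class Tx:
    sender: str
    recipient: str
    value: int

def apply_tx(state: dict, tx: Tx) -> dict:
    """Return the post-state; reject transactions that overdraw the sender."""
    if state.get(tx.sender, 0) < tx.value:
        raise ValueError("insufficient balance")
    post = dict(state)
    post[tx.sender] -= tx.value
    post[tx.recipient] = post.get(tx.recipient, 0) + tx.value
    return post

state = {"alice": 100, "bob": 0}
state = apply_tx(state, Tx("alice", "bob", 30))
print(state)  # {'alice': 70, 'bob': 30}
```

The real EVM state is far richer (storage, code, nonces), but the shape is the same: a deterministic function from a pre-state and a transaction to a post-state.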

Blockchain trilemma

Rollups as Layer 2 solution

Traditional blockchains like Bitcoin and Ethereum are secure and decentralized but have limited scalability. To tackle these scalability challenges, Ethereum has explored "simple techniques" such as Layer 2 solutions that scale by allowing different applications to operate on multiple chains, interconnected through a cross-chain communication protocol for interoperability.

While this approach maintains decentralization and improves scalability, it introduces potential security vulnerabilities. An attacker could compromise the main chain by gaining control of a majority of consensus nodes on a single chain, causing collateral effects across the other interconnected chains[4]. The primary security weakness lies in the verification process of blocks generated by these Layer 2 technologies. Unlike Layer 1, where blocks are fully verified, Layer 2 blocks contain only the parts of the state required to process the block, along with some hashes proving that the provided state parts represent the claimed state of the block. This solution involves computation and data availability verification to ensure the system has performed computations correctly, assuming block validators have access to all required inputs and that those inputs are stored accessibly for anyone to download if needed[5]. In the context of rollups, two technologies have emerged capable of processing both verifications with a proof mechanism as a security safeguard:

  1. Optimistic Rollups assume all transactions are valid by default and employ fraud proofs to disprove invalid transactions within a challenge window, typically seven days.

  2. Zero Knowledge (ZK) Rollups provide instant verification of transaction validity using validity proofs.

In optimistic rollups, fraud/fault proofs come into play when a dispute arises, detecting and proving that a specific block or transaction is invalid through merklized data. This allows nodes in the network to receive and verify block states without downloading the entire blockchain, assuming at least one honest node is willing to generate these proofs[6]. This mechanism is also known as an interactive game[7], where a participant optimistically assumes that each proposed result is valid by default[8].
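The interactive game reduces a dispute over an entire execution to a single step via bisection: the challenger binary-searches for the first point where the two parties' execution traces diverge, so only one instruction needs to be re-executed on chain. A minimal sketch, assuming both parties hold full traces and agree on the start state:

```python
# Bisection dispute sketch: find the first step where the honest trace and
# the claimed trace diverge. Only that single step must then be re-executed
# by the on-chain verifier. Structure is illustrative, not the Cannon protocol.
def first_disagreement(honest: list, claimed: list) -> int:
    """Return the index of the first state where the two traces diverge."""
    lo, hi = 0, len(honest) - 1
    # invariant: the traces agree at index lo and disagree at index hi
    assert honest[0] == claimed[0] and honest[-1] != claimed[-1]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if honest[mid] == claimed[mid]:
            lo = mid
        else:
            hi = mid
    return hi  # the single disputed step is lo -> hi

honest = [0, 1, 2, 3, 4, 5]
claimed = [0, 1, 2, 9, 10, 11]  # diverges at step 3
print(first_disagreement(honest, claimed))  # 3
```

Because the search halves the range each round, a trace of billions of steps resolves in a few dozen challenge rounds.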

Optimism and the OP stack

Optimistic rollups, a type of L2 solution, aggregate multiple transactions into batches and post them on the Ethereum mainnet, significantly reducing the data and computation required on Layer 1[1]. This approach enables faster and more cost-effective scalability by assuming transactions are valid by default[9].

Optimism, a Layer 2 scaling solution for Ethereum, utilizes Optimistic Rollups to improve scalability, increase transaction speed, reduce costs, and maintain robust security[10]. The core of this ecosystem is OP Mainnet, an EVM-equivalent Layer 2 blockchain directly connected to Ethereum, very similar to the Ethereum mainnet, with only minor differences, primarily involving the specification of different network endpoints[11].

OP Mainnet leverages the OP Stack, a modular, open-source software development stack maintained by the Optimism Collective. The OP Stack facilitates the creation of Optimistic Rollups and production-ready Layer 2 blockchains, allowing developers to independently customize and upgrade components such as consensus, execution, and settlement layers, fostering innovation and ensuring long-term adaptability.

Furthermore, the OP Stack supports the Optimism Superchain, a network of interoperable OP Stack chains with common standards and protocols. This modular design facilitates scalability, customization, interoperability, and innovation, helping to avoid the repeated development of similar software from scratch[12]. Introducing a permissionless Fault Proof System on OP Mainnet marks a significant step towards full decentralization, removing dependence on privileged roles for withdrawals and allowing any user to challenge and verify transactions, thus reducing reliance on centralized authorities.

As the OP Stack evolves, it will incorporate additional functionalities, such as more sophisticated fault-proof systems and enhanced interoperability features, paving the way for a robust, interconnected ecosystem of decentralized applications on Ethereum. The current iteration of the OP Stack not only powers OP Mainnet but also simplifies deploying Optimistic Rollups and supports the Superchain concept. This vision aims to create a scalable and interconnected blockchain ecosystem, where multiple L2 chains can seamlessly interact while benefiting from shared security and development standards.

As the technology proves its security and reliability, the plan is to extend these capabilities, including fraud proofs, to other networks within the Optimism ecosystem, such as Base, Metal, Mode, and Zora. This will further solidify the Superchain concept and push the boundaries of Layer 2 scalability and interoperability [13].

From Optimism Docs

The OP Stack is a common development stack for building L2 blockchain ecosystems, built by the Optimism Collective to power Optimism. The OP Stack is best thought of as a collection of software components maintained by the Optimism Collective that either help to define new layers of the stack or fit in as modules within the stack.

  • 1 Data Availability

    The Data Availability Layer defines where the raw inputs to an OP Stack based chain are published. An OP Stack chain can use one or more Data Availability modules to source its input data. Because an OP Stack chain is derived from the Data Availability Layer, the Data Availability module(s) used have a significant impact on the security model of a system.

    Ethereum DA is currently the most widely used Data Availability module for the OP Stack. When using the Ethereum DA module, source data can be derived from any piece of information accessible on the Ethereum blockchain. This includes Ethereum calldata, events, and 4844 data blobs.

  • 2 Sequencing Layer

    The Sequencing Layer determines how user transactions on an OP Stack chain are collected and published to the Data Availability Layer module(s) in use. In the default Rollup configuration of the OP Stack, Sequencing is typically handled by a single dedicated Sequencer. Rules defined in the Derivation Layer generally restrict the Sequencer's ability to withhold transactions for more than a specific period of time.

  • 3 Derivation Layer (Rollup / OP node)

    The Derivation Layer defines how the raw data in the Data Availability Layer is processed to form the processed inputs that are sent to the Execution Layer via the standard Ethereum Engine API. The Derivation Layer may also use the current system state, as defined by the Execution Layer, to inform the parsing of raw input data. The Derivation Layer can be modified to derive Engine API inputs from many different data sources. The Derivation Layer is typically tied closely to the Data Availability Layer because it must understand how to parse any raw input data.

  • 4 Execution Layer

    The Execution Layer defines the structure of state within an OP Stack system and defines the state transition function that mutates this state. State transitions are triggered when inputs are received from the Derivation Layer via the Engine API. The Execution Layer abstraction opens up the door to EVM modifications or different underlying VMs entirely.

    EVM

    The EVM is an Execution Layer module that uses the same state representation and state transition function as the Ethereum Virtual Machine. The EVM module in the Ethereum Rollup configuration of the OP Stack is a lightly modified version of the EVM that adds support for L2 transactions initiated on Ethereum and adds an extra L1 Data Fee to each transaction to account for the cost of publishing transactions to Ethereum.

  • 5 Settlement Layer

    The Settlement Layer is a mechanism on external blockchains that establishes a view of the state of an OP Stack chain on those external chains (including other OP Stack chains). For each OP Stack chain, there may be one or more Settlement mechanisms on one or more external chains. Settlement Layer mechanisms are read-only and allow parties external to the blockchain to make decisions based on the state of an OP Stack chain.

    The term "Settlement Layer" has its origins in the fact that Settlement Layer mechanisms are often used to handle withdrawals of ETH and tokens out of a blockchain. This sort of withdrawal system first involves proving the state of the target blockchain to some third-party chain and then processing a withdrawal based on that state. The Settlement Layer, at its core, simply allows a third-party chain to become aware of the state of the target chain.

    An Attestation-based Fault Proof mechanism uses an optimistic protocol to establish a view of an OP Stack chain. In optimistic settlement mechanisms generally, Proposer entities can propose what they believe to be the current valid state of the OP Stack chain. If these proposals are not invalidated within a certain period of time (the "challenge period"), then the proposals are assumed by the mechanism to be correct. In the Attestation Proof mechanism in particular, a proposal can be invalidated if some threshold of pre-defined parties provide attestations to a valid state that is different than the state in the proposal. This places a trust assumption on the honesty of at least a threshold number of the pre-defined participants.

  • 6 Governance Layer

    The Governance Layer refers to the general set of tools and processes used to manage system configuration, upgrades, and design decisions. This is a relatively abstract layer that can contain a wide range of mechanisms on a target OP Stack chain and on third-party chains that impact many of the other layers of the OP Stack.
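The attestation-based settlement mechanism described under the Settlement Layer above can be sketched as follows. Class, method, and field names here are hypothetical illustrations, not the actual OP Stack contract interface:

```python
# Sketch of attestation-based optimistic settlement: a proposal finalizes
# unless, before the challenge period elapses, a threshold of pre-defined
# attestors attest to a state root different from the proposed one.
class AttestationGame:
    def __init__(self, attestors, threshold, challenge_period):
        self.attestors = set(attestors)
        self.threshold = threshold          # attestations needed to invalidate
        self.challenge_period = challenge_period  # e.g. in days or blocks

    def resolve(self, proposed_root, attestations, elapsed):
        """attestations: {attestor: claimed_root} gathered during the dispute."""
        disagreeing = sum(
            1 for who, root in attestations.items()
            if who in self.attestors and root != proposed_root
        )
        if disagreeing >= self.threshold:
            return "invalidated"
        if elapsed >= self.challenge_period:
            return "finalized"
        return "pending"

game = AttestationGame(["a", "b", "c"], threshold=2, challenge_period=7)
print(game.resolve("0xroot", {"a": "0xother", "b": "0xother"}, elapsed=1))  # invalidated
print(game.resolve("0xroot", {}, elapsed=7))  # finalized
```

This makes the trust assumption explicit: safety rests on the honesty of at least a threshold number of the pre-defined attestors.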

    From OPcity Repository:

Interpretation of the OP stack layers and their function.

2. Node & Rollup Setup

From OPcity repository:

This section provides a comprehensive account of our hands-on experience setting up and running a node and deploying a rollup using the OP Stack. We conducted the setup process using two Intel NUC 13 PRO NUC13ANHi7 Arena Canyon devices, each equipped with a 13th-generation Intel Core i7-1360P CPU, 32GB RAM, and a 4TB SSD running Linux (Ubuntu).

We relied on a remote VM and a third-party RPC service in the December tests. However, this setup presented significant challenges, particularly regarding the limitations of RPC calls and the restrictions imposed by proprietary hardware. The use of third-party RPC services introduced latency and potential security concerns, which hindered the overall performance of our test environment. Furthermore, the reliance on a remote VM limited our control over the hardware, leading to issues in scalability and reliability during the testing phase.

To overcome these challenges, we transitioned to a home environment setup using the Intel NUC devices. This shift allowed us to bypass the limitations of third-party RPC services and gain full control over the hardware, leading to a more reliable and efficient testing environment. We documented the setup process in our GitHub repository, including all commands and screenshots, to guide others through the deployment process.

We documented the setup process in two main stages required to deploy a rollup on the Holesky testnet:

A. Spin up an L1 Node (Holesky Testnet):

This stage outlines the steps to configure an L1 node, starting with acquiring the necessary hardware. The Intel NUC devices provided the computing power required to handle the installation of Linux (Ubuntu), followed by the setup of Geth and Prysm dependencies. We then installed and started Geth, forming the Ethereum network's base layer, followed by the installation and initialization of Prysm, which facilitated the execution of the proof-of-stake consensus mechanism. This configuration ensured a robust L1 node capable of supporting subsequent rollup deployments.
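In outline, the commands for this stage look like the following. Exact flags vary between Geth and Prysm releases, so treat this as a hedged guide rather than a turnkey script:

```shell
# Sketch of the L1 (Holesky) node spin-up. Flag names follow current
# Geth/Prysm releases but may differ by version; check each client's docs.

# Shared JWT secret so the execution and consensus clients can authenticate
openssl rand -hex 32 | tr -d "\n" > jwt.hex

# Execution client: Geth on Holesky, exposing HTTP RPC and the Engine API
geth --holesky \
  --http --http.api eth,net,web3 \
  --authrpc.addr localhost --authrpc.port 8551 \
  --authrpc.jwtsecret ./jwt.hex

# Consensus client: Prysm beacon node paired with the local Geth instance
./prysm.sh beacon-chain --holesky \
  --execution-endpoint http://localhost:8551 \
  --jwt-secret ./jwt.hex
```

Both processes must stay running, and syncing Holesky can take from hours to days depending on hardware and peer availability.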

Two-node setup for OP stack version benchmark

B. Deploy an L2 rollup from the OP stack

Building on the foundation of the L1 node, this stage documents the deployment of an L2 rollup using the OP Stack. We began by installing the necessary Optimism dependencies and building the source code. We then deployed the L1 contracts and initialized the L2 chain, which involved setting up and running key components such as op-geth and op-node. The process culminated in obtaining Holesky ETH on the L2 network and sending test transactions, successfully demonstrating the deployment of a rollup in a controlled home environment.
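The launch sequence for the L2 components can be sketched as below. The op-geth and op-node flags shown are representative of OP Stack tooling but should be checked against the version in use, and the environment variable and file names are placeholders:

```shell
# Sketch of the L2 rollup launch on top of the Holesky L1 node; flags are
# illustrative of op-geth / op-node conventions, not an exhaustive config.

# Start the rollup's execution engine (op-geth)
./op-geth --datadir ./l2-datadir \
  --http --http.api web3,eth,net \
  --authrpc.addr localhost --authrpc.port 8552 \
  --authrpc.jwtsecret ./jwt.txt

# Start the rollup node (derivation + sequencing), pointing at L1 and L2
./op-node \
  --l1 "$L1_RPC_URL" \
  --l2 http://localhost:8552 \
  --l2.jwt-secret ./jwt.txt \
  --rollup.config ./rollup.json \
  --sequencer.enabled
```

The batcher and proposer services run alongside these two processes and are the components whose gas spending the later benchmarks measure.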

Holesky node running a rollup from the OP stack

3. OP stack version benchmark

From OPcity repository:

This section presents the results of three test deployments: one conducted in December 2023 and two in June 2024. Our findings highlight a significant reduction in batcher and proposer operation costs with the June 2024 dual deployment on OP Stack V7.0.0 Fjord, compared to the December deployment on OP Stack V4.0.0 Canyon.

The main finding between versions is a notable reduction in the total gas fees used by the rollup, with V7.0.0 cutting operation costs by ~75% compared with the V4.0.0 deployment from December. The June benchmarks also compared the calldata method versus data blobs for registering data on Layer 1, revealing that data blobs consume approximately 60% fewer resources than calldata.

Test Deployment 1 (OP stack V4.0.0 / December 2023)

From December 4th to 11th, 2023, we conducted an explorative evaluation of the operational cost performance of the OP stack within a testnet environment, focusing specifically on the gas spending of the batcher and proposer components. This testing phase was strategically scheduled before the Dencun, Ecotone, and Fjord upgrades, establishing a baseline before advanced features such as span batches, data blobs, and fault proofs were implemented.

Testing an OP rollup in a VM for 7 days

We have used a Virtual Machine in Google Cloud to deploy an OP rollup with the following configuration:

Machine type: e2-standard-2
Cores: 2 vCPU
Memory: 8 GB

After connecting to a Sepolia node through Alchemy, installing dependencies, and building the Optimism Monorepo and op-geth, we created the four accounts that manage the rollup operation:

  • The Admin (0x…92D5) account, which can upgrade contracts.

  • The Batcher (0x…898ce) account, which publishes Sequencer transaction data to L1.

  • The Proposer (0x…B561) account, which publishes L2 transaction results to L1.

  • The Sequencer (0x…3868) account, which signs blocks on the p2p network.

Along with these accounts, a Bridge contract is created to transfer funds into the rollup.

Preliminary Findings

While investigating the operation of the OP stack as a potential solution to this issue, we faced multiple limitations relating to the operation cost of the batcher contract and the constraints of relying on either an RPC-as-a-service or a Rollup-as-a-service platform. We found this issue during a 7-day rollup deployment from the OP stack on the Sepolia Testnet, where the main highlight is the high number of transactions generated by the batcher (~27,000) and the gas spent on them (~9.5 ETH). This reflects only the default rollup activity, without any additional user or contract interaction; the RPC call limit imposed by the provider (Alchemy) may trigger a chain reaction of empty blocks from the sequencer to the batcher, accelerating the transaction and gas fee rate.

Upon comparing the data with the OP Sepolia batcher, there is a significant difference in the number of transactions made during the same dates (~11,000, or about 40% of the 7-day reference run) but not in the gas spent, which was higher (~12 ETH), consistent with the testnet's use by a high volume of actual users and contracts. However, this cost profile may not fulfill the Superchain vision of multiple chains derived from the OP stack, since chains that lack the same user volume, or high-throughput chains used primarily for non-DeFi applications, cannot rely on sequencer revenue to be profitable.

OP stack test deployment 01 / December 2023

Test Deployment 2 (OP stack V7.0.0 - Call Data) / June 2024

From OPcity repository:

During the first test deployment, we identified that using a third-party RPC provider imposes a limit on network calls, generating issues with the batcher posting data correctly and increasing gas fee costs due to failed RPC communication. To address this issue, we set up a node in the office to provide a reliable source of RPC calls. This crucial step allowed us to observe and measure the impact on the number of transactions and the gas fees paid for each. The spin-up process is documented in the previous section and in this OP City repository folder.

After setting up a Holesky node, we deployed a rollup from the OP stack on that testnet using V7.0.0 Fjord and the calldata method to post transactions from the Batcher and Proposer. This test deployment lasted 20 days and occurred after multiple network updates, including span batches, data blob compatibility, and other optimizations that notably reduced the rollup operation cost relative to the December test. Using the same data posting method as in December, the Total Gas Fees used on the default rollup operation transactions over the first seven days were reduced by ~75%, from 10.63 to 2.59 ETH, while the cost per day went from 1.5 to 0.37 ETH.
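As a quick sanity check on the quoted figures:

```python
# Arithmetic check on the reported savings between the December (V4.0.0)
# and June (V7.0.0, calldata) runs, using the figures quoted above.
def pct_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100

print(round(pct_reduction(10.63, 2.59)))  # 76 -> the "~75%" 7-day reduction
print(round(pct_reduction(1.50, 0.37)))   # 75 -> the matching daily-cost drop
```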

The data from the rollup addresses are available here:

OP stack test deployment 02 / June 2024

Test Deployment 3 (OP stack V7.0.0 - Data Blobs) / June 2024

From OPcity repository:

While the reduction in total gas fees using calldata is significant, the implementation of data blobs as a data posting method offers an alternative that can further reduce the rollup operation cost. To prove it, we deployed a third testnet rollup a week later using the same V7.0.0 with data blobs as the data posting method. By comparing the performance of both configurations, we identified the following:

  1. A notable reduction in the number of transactions made by the Batcher and Proposer addresses, with around 50% less activity (over 100k txn with calldata vs ~50k txn with data blobs)

  2. ~75% fewer gas fees used by the rollup (2.59 ETH with calldata vs 0.64 ETH with data blobs per 7 days of operation, and 0.37 vs 0.09 ETH per day, respectively)

The data from the rollup addresses are available here:

OP stack test deployment 03 / June 2024

4. Proposed changes to Fault Proofs

Proposed during OP Governance Season 5; grant finalist during Cycle 22

We will research the compatibility of the OP stack's Cannon Fault Proof VM with opML's Multi-Phase Fault Proof protocol. The goal is to implement a custom Fault Dispute Game that manages the challenges related to data availability states from the L2 rollup and the computation results from the Deep Neural Networks (DNN) of the multi-phase opML. This could be made possible with an incentive mechanism for node verifiers that resolve disputes for both technologies.

Proposed Modifications:

OpML uses a multi-phase fraud proof mechanism to ensure the accuracy of machine learning results onchain. This mechanism is similar to the experimental Cannon Fault Proofs from the OP stack. Both technologies use a Fault Dispute Game to allow verifiers to resolve challenges on a game tree. Within this process, we aim to expand the merklized data in the OP stack FPVM to include the state transitions of the opML Multi-Phase dispute game. By doing this, we aim to achieve a unified framework capable of natively processing machine learning inferences onchain. Digging deeper, we intend to find out whether this framework’s implementation can extend the current FPVM specs or serve as an alternative version of them.

A. State Transition Function Modeling

The FPVM functions as a state transition system where a function f maps a pre-state S_pre to a post-state S_post based on an executed instruction: f(S_pre) → S_post

For integration:

  • Proposed Framework Modification: Introduce an additional layer that handles complex decision trees or neural network outputs, which adjusts how the state transitions are computed, especially in handling error states or exceptions.

  • Consider a modified state transition function f(S_pre, D), where D represents data or decisions derived from opML processes, impacting the transition to S_post.

  • Modified Function: f(S_pre, D) → S_post

  • Define a new state component that includes neural network inference results, which influences the transition process, particularly in how exceptions are handled.
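The proposed f(S_pre, D) transition can be sketched as illustrative Python. This is pseudocode-level modeling of the idea, not the actual Cannon step function, and the instruction names are hypothetical:

```python
# Sketch of a modified transition f(S_pre, D) -> S_post, where D carries
# opML-derived data (e.g. a committed DNN inference result) that can
# influence the transition, including how exception states are reached.
def step(s_pre, instruction, d=None):
    s_post = dict(s_pre)
    if instruction == "add":
        s_post["acc"] = s_post.get("acc", 0) + 1
    elif instruction == "ml_infer" and d is not None:
        # the ML phase injects its committed inference result into the state
        s_post["inference"] = d["result"]
    else:
        s_post["trap"] = True  # unknown instruction -> exception state
    return s_post

s = step({"acc": 0}, "add")
s = step(s, "ml_infer", d={"result": 0.93})
print(s)  # {'acc': 1, 'inference': 0.93}
```

The key property for dispute resolution is that the extended function stays deterministic given (S_pre, D), so both the base VM steps and the ML phase remain bisectable in a fault dispute game.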

B. Memory Management Analysis

Given the detailed memory specifications:

Heap and Memory Operations: Analyze the implications of integrating a mechanism for handling large datasets required by machine learning models directly within the memory structure of FPVM.

Suppose M(S) is the memory utilization state function. Introduce M'(S, D) to handle additional data structures or caching mechanisms that optimize ML data handling.

Memory Function: M(S) → M'(S, D)
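An illustrative sketch of this extension, with hypothetical function and field names:

```python
# Sketch of extending the memory view M(S) to M'(S, D): the extended view
# tracks an ML region (model weights, input batch) alongside the base VM heap.
def M(state):
    """Base memory utilization view of a VM state."""
    return {"heap": dict(state.get("heap", {}))}

def M_prime(state, d):
    """Extended view that also tracks opML data structures."""
    view = M(state)
    view["ml_region"] = {"weights": d.get("weights"), "batch": d.get("batch")}
    return view

view = M_prime({"heap": {0: 7}}, {"weights": "0xw", "batch": [1, 2]})
print(view["ml_region"]["batch"])  # [1, 2]
```

Keeping the ML region a pure function of (S, D) means memory claims about it can be merklized and disputed the same way as the base heap.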

C. Syscalls and I/O Enhancements

The proposed framework could potentially extend the syscall and I/O functionalities to better support ML-driven data processing:

Extended Syscalls for ML: Introduce new syscalls specific to ML operations, such as data batching or model loading.

I/O Modeling: Adjust the I/O model to handle larger data streams efficiently, crucial for ML processes. Propose modifications like enhanced buffer management or asynchronous I/O operations.
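The extended-syscall idea can be sketched as a dispatch table. The syscall numbers and names below (a model-loading call, a data-batching call) are hypothetical illustrations, not part of the Cannon specification:

```python
# Sketch of an FPVM syscall table extended with ML-specific calls.
SYSCALLS = {}

def syscall(num):
    """Register a handler under a syscall number."""
    def register(handler):
        SYSCALLS[num] = handler
        return handler
    return register

@syscall(0x1000)  # hypothetical: bind a committed model to the VM state
def sys_load_model(state, model_hash):
    state["model"] = model_hash
    return state

@syscall(0x1001)  # hypothetical: queue a batch of input data for inference
def sys_batch_data(state, rows):
    state.setdefault("batches", []).append(rows)
    return state

vm = {}
vm = SYSCALLS[0x1000](vm, "0xabc")
vm = SYSCALLS[0x1001](vm, [1, 2, 3])
print(vm)  # {'model': '0xabc', 'batches': [[1, 2, 3]]}
```

In a real FPVM these handlers would also have to be provable, i.e. each syscall's effect on merklized memory must be verifiable step by step during a dispute.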

D. Formal Verification and Error Analysis

Given the complexity of ML integrations and the critical role of fault-proofing:

Model how errors in the ML phase could propagate through the system, influencing state transitions and memory states.

Utilize formal methods to verify the correctness of the integrated system under various operational conditions, ensuring that the modifications do not introduce new vulnerabilities.

E. Simulation and Evaluation Metrics

Develop simulations that mimic real-world operational conditions to evaluate the effectiveness of the proposed modifications:

Create scenarios where traditional and ML-modified FPVMs are subjected to typical and atypical loads, measuring performance metrics like throughput, error rate, and response time.

Define specific metrics to evaluate improvements or regressions in system behavior due to the integration, such as memory efficiency, fault detection accuracy, and computational overhead.

Further, we intend to use this framework implementation to provide the infrastructure required to process onchain the high volumes of public data generated in cities, and to use onchain models to make ML inferences from those datasets. To protect the privacy of citizens' sensitive data, we are exploring the implementation of zkML through ORA's oppAI framework. This extension strategically balances the trade-offs between privacy and computational efficiency. By leveraging the strengths of zkML's privacy-preserving techniques and opML's computational efficiency, oppAI enables a hybrid model that optimizes both aspects for onchain AI applications.

References

Ethereum and Fault Proofs

[1] Buterin, V. (2015). A Next Generation Smart Contract & Decentralized Application Platform.

[2] Karbasi, A.H., Shahpasand, S. A post-quantum end-to-end encryption over smart contract-based blockchain for defeating man-in-the-middle and interception attacks. Peer-to-Peer Netw. Appl. 13, 1423–1441 (2020).

[3] Werth, J., Hajian Berenjestanaki, M., Barzegar, H. R., El Ioini, N., & Pahl, C. (2023). A review of blockchain platforms based on the scalability, security and decentralization trilemma. Proceedings of the 19th International Conference on Web Information Systems and Technologies (WEBIST 2023).

[4] Buterin, V. (2021). Why sharding is great: demystifying the technical properties.

[5] Buterin, V. (2018). A note on data availability and erasure coding.

[6] Al-Bassam, M., Sonnino, A., & Buterin, V. (2018). Fraud Proofs and Data Availability: Maximising Light Client Security and Scaling Blockchains with Dishonest Majorities. CoRR abs/1809.09044.

[7] Optimism. OP stack Specification.

[8] Conway, KD., So, C., Yu, X., & Wong, K. (2024). opML: Optimistic Machine Learning on Blockchain

OP Stack

[9] Poon, J., & Buterin, V. (2016). Plasma: Scalable Autonomous Smart Contracts.

[10] Buterin, V. (2021). An Incomplete Guide to Rollups.

[11] Optimism documents: Glossary.

[12] Getting started: OP Mainnet.

[13] Optimism Developer Blog. Protocol Development: The Fault Proof System is available for the OP Stack.
