In addition to the native scaling solutions mentioned in Part I, another path for Bitcoin scaling involves the establishment of an additional protocol layer on top of Bitcoin, known as Layer 2. The most critical aspects of Layer 2 solutions are the secure bi-directional bridges and the inheritance of Bitcoin's consensus security.
The concept of sidechains dates back to 2014 when Blockstream submitted "Enabling Blockchain Innovations with Pegged Sidechains." It represents a relatively basic approach to scaling.
A sidechain is a blockchain that operates independently from the mainchain, with its own consensus protocol, and can serve as a testing ground for innovations on the mainchain. When adverse events occur on a sidechain, the damage is confined entirely to the sidechain itself, without any impact on the mainchain. Sidechains can employ consensus protocols with higher TPS (transactions per second), enhance on-chain programmability, and extend the functionality available to BTC.
Sidechains can facilitate the transfer of Bitcoin between different blockchains through either a two-way peg or a one-way peg. However, in reality, BTC can only reside on the Bitcoin mainnet, so an anchoring mechanism is needed to link BTC on a sidechain with BTC on the Bitcoin mainnet.
A one-way peg requires users to send BTC from the mainnet to an unusable address to be burned, after which an equivalent amount of BTC is minted on the sidechain, but this process is irreversible. A two-way peg is an advancement over the one-way peg, allowing BTC to move back and forth between the mainchain and the sidechain. Instead of burning by sending to an unusable address, a two-way peg locks the BTC through multisig or other control scripts, minting new BTC on the sidechain. When users want to return to the mainnet, the BTC on the sidechain is burned, and the originally locked BTC is released on the mainnet.
The implementation of a one-way peg is much simpler compared to a two-way peg, as it does not need to manage related states on the Bitcoin mainnet. However, sidechain assets created through a one-way peg might be worthless because they lack a reverse anchoring mechanism.
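The difference between the two pegs can be sketched as simple bookkeeping. The following is a minimal illustration, not how any particular sidechain implements its bridge: the class names `TwoWayPeg` and `OneWayPeg` and their fields are hypothetical, amounts are plain integers (satoshis), and the verification of lock and burn transactions is left out entirely.

```python
class TwoWayPeg:
    """Toy model: BTC locked on the mainchain backs tokens minted on the sidechain."""

    def __init__(self):
        self.locked_btc = 0        # held by the mainchain lock script (e.g. multisig)
        self.sidechain_supply = 0  # pegged tokens circulating on the sidechain

    def peg_in(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.locked_btc += amount        # lock on the mainchain
        self.sidechain_supply += amount  # mint the same amount on the sidechain

    def peg_out(self, amount):
        if amount <= 0 or amount > self.sidechain_supply:
            raise ValueError("cannot burn more than the circulating supply")
        self.sidechain_supply -= amount  # burn on the sidechain
        self.locked_btc -= amount        # release on the mainchain


class OneWayPeg:
    """Toy model: BTC is burned on the mainchain, so there is nothing to release later."""

    def __init__(self):
        self.burned_btc = 0
        self.sidechain_supply = 0

    def peg_in(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.burned_btc += amount
        self.sidechain_supply += amount

    def peg_out(self, amount):
        # No reverse anchoring mechanism exists for a one-way peg.
        raise NotImplementedError("burned BTC cannot be recovered")
```

In the two-way model, the locked balance always equals the sidechain supply, which is the invariant a real bridge has to protect. The one-way model has no state to manage on the mainchain, which is why it is simpler to implement.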
For verifying lock transactions on the main chain and burn transactions on the sidechain, there are different schemes and security levels. The simplest involves external verification by multisig participants, but this carries a high risk of centralization. A better option is to use SPV Proof for decentralized verification. However, due to the lack of necessary programming capabilities on the Bitcoin mainnet, SPV verification cannot be performed, and other methods, typically multisig custody, must be used.
The main issues criticized about sidechains include:
Asset cross-chain reliance on validators: Since the Bitcoin mainnet still cannot implement smart contracts, cross-chain asset transfers cannot be managed through trustless contract logic. Returning assets from a sidechain to Bitcoin requires reliance on a group of validators, introducing trust assumptions and fraud risks.
Sidechains cannot inherit main chain security: As sidechains operate completely independently from the mainnet, they cannot inherit the security of the mainnet, potentially leading to malicious block reorganizations.
To address these, sidechain approaches include reliance on authorities (federated), economic security (PoS), decentralized Bitcoin miners (Merged Mining), and hardware security modules (HSM). Custody of funds on Bitcoin and block production on the sidechain can be managed by different roles, introducing more complex security mechanisms.
One of the earliest forms of sidechains is the federated sidechain, which relies on a pre-selected group of entities to act as validators, who are responsible for custodying assets on the main network and producing blocks on the sidechain.
Liquid is a representative example of a federated sidechain, with 15 participating parties acting as validators. The management of private keys is not disclosed, and validation requires 11 out of 15 signatures. The block production on the Liquid sidechain is also maintained by these 15 participants. The low number of nodes in this federation allows for a higher Transactions Per Second (TPS), achieving scalability objectives, with its primary application being in DeFi.
However, the federated sidechain model presents significant centralized security risks.
RSK is also managed by 15 nodes that custody main network funds, with validation requiring only 8 signatures. Unlike Liquid, RSK's multisig keys are managed by Hardware Security Modules (HSMs), and peg-out instructions are signed based on Proof of Work (PoW) consensus, preventing validators with key access from directly manipulating the custodied funds.
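The two federation thresholds can be compared with simple arithmetic. The helper below is a hypothetical illustration that looks only at the m-of-n signature rule. It deliberately ignores RSK's HSM protections, which change what holding a key actually allows.

```python
def federation_profile(m, n):
    """For an m-of-n multisig, report how many keys an attacker must control
    to move funds and how many signers can be offline before funds freeze."""
    if not 1 <= m <= n:
        raise ValueError("require 1 <= m <= n")
    return {"keys_to_steal": m, "offline_tolerated": n - m}


liquid = federation_profile(11, 15)  # thresholds as quoted in the text
rsk = federation_profile(8, 15)
```

Under this simplified view, a higher threshold makes theft harder but leaves less room for signers to be unavailable, while a lower threshold does the opposite.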
In terms of sidechain consensus, RSK uses Merged Mining, leveraging the main network's hash power to secure transactions on the sidechain. When a substantial proportion of the main network's hash power is used for merged mining, it effectively prevents double-spend attacks on the sidechain. RSK has improved upon Merged Mining to ensure the security of the sidechain under low hash rates by using a fork-aware approach that intervenes in off-chain consensus on fork behaviors, reducing the likelihood of double-spending.
However, Merged Mining alters miner incentives and exacerbates the risks of Miner Extractable Value (MEV), potentially destabilizing the system. Over time, Merged Mining could contribute to increased centralization in mining.
Stacks anchors its chain history to Bitcoin by committing the hash of its sidechain blocks into Bitcoin blocks, achieving the same finality as Bitcoin. Forks in Stacks can only occur if Bitcoin itself forks, enhancing its resistance to double-spend attacks.
sBTC introduces a new token and incentive model, utilizing a staking bridge that allows up to 150 main network validators. Validators need to stake STX tokens to obtain permissions to approve deposits and withdrawals. The security of the staking bridge heavily depends on the value of the staked assets, which poses a risk to the cross-chain security of BTC during significant price fluctuations of the staked assets.
Other sidechain proposals are currently being widely discussed in the community.
One of the most notable is the Drivechain proposal put forward by Paul Sztorc in 2015, whose key technologies have been assigned BIP 300 (pegging mechanism) and BIP 301 (blind merged mining). BIP 300 defines the logic for adding a new sidechain, which is activated via miner signaling in a manner similar to a soft fork. BIP 301 allows Bitcoin miners to become block producers of the sidechain without needing to verify the specifics of transactions.
Bitcoin miners are also responsible for approving withdrawal transactions. They initiate a withdrawal proposal by creating an OP_RETURN output in the coinbase transaction of the blocks they mine. Other miners can then vote on this proposal by supporting or opposing it in each block they mine. Once support for a withdrawal transaction surpasses a threshold (13,150 blocks), it is executed and confirmed on the Bitcoin main chain.
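The voting process can be pictured as a running score that moves with each mined block. The sketch below is a simplification under stated assumptions: the function name `withdrawal_approved` is hypothetical, an opposing block is assumed to lower the score by one, the score is assumed never to drop below zero, and approval is assumed to happen when the score reaches the threshold. BIP 300 itself should be consulted for the exact rules.

```python
ACK_THRESHOLD = 13_150  # threshold quoted in the text


def withdrawal_approved(votes, threshold=ACK_THRESHOLD):
    """votes holds one entry per mined block: +1 support, -1 oppose, 0 abstain.

    Returns the 1-based block count at which the proposal is approved,
    or None if the threshold is never reached.
    """
    score = 0
    for height, vote in enumerate(votes, start=1):
        if vote not in (-1, 0, 1):
            raise ValueError("each vote must be -1, 0 or +1")
        score = max(0, score + vote)  # assumption: score is floored at zero
        if score >= threshold:
            return height
    return None
```

Even with unanimous miner support, a withdrawal in this model takes 13,150 blocks, which is roughly three months at ten minutes per block.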
In reality, miners have complete control over the funds on the Drivechain. If funds are stolen, users can only resort to a User-Activated Soft Fork (UASF) for self-rescue, which is challenging to achieve consensus on. Moreover, the unique position of miners in Drivechain increases MEV risks, as already evidenced in Ethereum.
Spacechain takes a different approach by using a perpetual one-way peg (P1WP), where users burn BTC to obtain tokens on Spacechain, bypassing the issue of fund security entirely. These tokens are used solely to bid for block space on Spacechain and do not function as a store of value.
To ensure the security of the sidechain, Spacechain utilizes blind merged mining, where users bid openly using ANYPREVOUT (APO) for the rights to construct blocks. Bitcoin miners only need to commit to the Spacechain block headers in their blocks, without needing to verify the sidechain blocks. However, the launch of Spacechain requires support for Covenants, and the Bitcoin community is still discussing the necessity of a soft fork to add Covenant opcodes.
Overall, Spacechain aims to achieve a sidechain with the same decentralization and censorship-resistance as Bitcoin, along with increased programmability through its block auctioning feature.
Softchain is another two-way peg (2wp) sidechain proposal, put forward by Ruben Somsen, which uses a PoW fraud proof (PoW FP) consensus mechanism to secure the sidechain. Under normal circumstances, Bitcoin full nodes need only download the block headers of the softchain to verify the proof of work. In the event of a fork, they download the orphan blocks and corresponding UTXO set commitments to verify the block's validity.
For the 2wp mechanism, a deposit transaction is created on the main chain during peg-in, and softchain references this main chain transaction to access funds; during peg-out, a withdrawal transaction is created on softchain, and the main chain references this transaction to retrieve BTC after a lengthy challenge period. The specific peg-in and peg-out mechanisms require soft fork support, thus the proposal is named Softchain.
Softchain's proposal imposes additional verification costs on Bitcoin's main network full nodes, and a consensus split within Softchain could potentially affect the consensus on the main network, posing a possible attack vector on Bitcoin.
The Lightning Network white paper was released in 2015, and it officially launched in 2018. As a Layer 2 peer-to-peer payment protocol on Bitcoin, it aims to shift a large volume of small, high-frequency transactions to off-chain processing. It has long been considered the most promising scaling solution for the Bitcoin network.
The implementation of the Lightning Network relies on several important modules within Bitcoin, which together ensure the security of transactions on the network.
Firstly, there are pre-signed transactions. These became securely usable after the SegWit upgrade. SegWit separates the signatures from the rest of the transaction data, solving potential issues like transaction malleability, third-party and second-party transaction tampering. The security of off-chain calculations in the Lightning Network is guaranteed by an irrevocable commitment provided by the counterparty, which is executed through pre-signed transactions. Once a user receives a pre-signed transaction from the counterparty, they can broadcast it to the blockchain at any time to fulfill the commitment.
Next are multi-signatures. Frequent off-chain fund transfers between two parties require a medium that both parties control, hence the need for multi-signatures, typically using a 2-of-2 scheme. This ensures that fund transfers can only occur with the mutual consent of both parties.
However, 2-of-2 multi-signatures can lead to a liveness problem, where if one party does not cooperate, the other cannot move any funds from the multi-signature address, resulting in a loss of the original funds. Time locks can solve the liveness issue; by pre-signing a contract with a time lock that returns the funds, it ensures that even if one party becomes inactive, the other can still recover the initial funds.
Finally, hash locks serve to connect multiple state channels, building a network effect. The pre-image of the hash acts as a means of communication, coordinating the correct operation among multiple entities.
To conduct transactions using the Lightning Network, both parties first need to open a bidirectional payment channel on Bitcoin. They can perform an unlimited number of transactions off-chain, and upon completion of all transactions, they submit the latest state to the Bitcoin blockchain to settle and close the payment channel.
Specifically, the implementation of the payment channel involves the following key steps:
Create a multi-signature address. Both parties first need to create a 2-of-2 multi-signature address to act as the funding lock for the channel. Each party holds a private key for signing and provides their public key.
Initialize the channel. Both parties broadcast a transaction on-chain to lock a certain amount of Bitcoin into the multi-signature address, serving as the initial funds for the channel. This transaction is known as the "anchor" transaction of the channel.
Update channel state. While making payments within the channel, both parties exchange pre-signed transactions to update the state of the channel. Each update generates a new "commitment transaction," representing the current distribution of funds. Commitment transactions have two outputs, corresponding to the fund shares of both parties.
Broadcast the latest state. Either party can broadcast the latest commitment transaction to the blockchain at any time to withdraw their share of the funds. To prevent the other party from broadcasting an outdated state, each commitment transaction is accompanied by a corresponding "penalty transaction," which allows one to claim all the other's funds in case of cheating.
Close the channel. When both parties decide to close the channel, they can cooperate to generate a "settlement transaction" and broadcast the final distribution of funds to the blockchain. This releases the funds locked in the multi-signature address back to the individual addresses of both parties.
On-chain arbitration. If the parties cannot agree on closing the channel, either party can unilaterally broadcast the latest commitment transaction to initiate an on-chain arbitration process. If there are no disputes within a certain period (e.g., one day), the funds will be distributed to both parties according to the allocation in the commitment transaction.
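The channel lifecycle described in the steps above can be sketched as a small state machine. This is a toy model under several assumptions: the class `Channel` and its methods are hypothetical, commitment transactions are reduced to balance pairs, signatures and scripts are omitted, and the cheated party is assumed to be online to broadcast the penalty transaction within the dispute window.

```python
class Channel:
    """Toy two-party payment channel between parties "A" and "B"."""

    def __init__(self, deposit_a, deposit_b):
        self.capacity = deposit_a + deposit_b   # locked in the 2-of-2 funding output
        self.states = [(deposit_a, deposit_b)]  # commitment states, oldest first

    def pay(self, sender, amount):
        """Off-chain update: append a new commitment state."""
        a, b = self.states[-1]
        if sender == "A":
            a, b = a - amount, b + amount
        elif sender == "B":
            a, b = a + amount, b - amount
        else:
            raise ValueError("sender must be 'A' or 'B'")
        if amount <= 0 or a < 0 or b < 0:
            raise ValueError("payment exceeds the sender's channel balance")
        self.states.append((a, b))

    def cooperative_close(self):
        """Both parties sign a settlement transaction for the latest state."""
        a, b = self.states[-1]
        return {"A": a, "B": b}

    def unilateral_close(self, broadcaster, state_index):
        """One party broadcasts a commitment; a stale one triggers the penalty."""
        if not 0 <= state_index < len(self.states):
            raise IndexError("unknown commitment state")
        if state_index == len(self.states) - 1:
            a, b = self.states[state_index]
            return {"A": a, "B": b}
        victim = "B" if broadcaster == "A" else "A"
        return {broadcaster: 0, victim: self.capacity}  # cheater forfeits everything


ch = Channel(60, 40)
ch.pay("A", 10)   # balances become (50, 50)
ch.pay("B", 30)   # balances become (80, 20)
```

Here B would prefer the older (50, 50) state, but broadcasting it hands the full channel capacity to A. That asymmetry is what makes exchanging pre-signed states off-chain safe.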
Payment channels can be interconnected to form a network that supports multi-hop routing through the use of HTLCs (Hashed Time-Locked Contracts). HTLCs operate with a hash lock as the direct condition and a time-locked signature payment as the fallback condition, allowing users to interact based on the pre-image of the hash before the expiration of the time lock.
When there is no direct channel between two users, payment can be completed using HTLCs across routed paths. Throughout this process, the pre-image of the hash, R, plays a crucial role in ensuring the atomicity of the payment. Additionally, the time locks in HTLCs are set to decrease along the route, ensuring that each hop has sufficient time to process and forward the payment.
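A minimal sketch of the hash lock and the decreasing time locks follows. The names `HTLC` and `build_route` are hypothetical, routing fees are ignored, and expiries are expressed as plain block heights. Real implementations encode these conditions in Bitcoin Script rather than in application code.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class HTLC:
    payer: str
    payee: str
    amount: int
    payment_hash: bytes  # H = SHA-256(R)
    expiry: int          # block height from which the payer may reclaim the funds

    def claim(self, preimage, height):
        """Payee can claim by revealing R before the time lock expires."""
        matches = hashlib.sha256(preimage).digest() == self.payment_hash
        return height < self.expiry and matches

    def refundable(self, height):
        """Payer can take the fallback path once the time lock has expired."""
        return height >= self.expiry


def build_route(nodes, amount, payment_hash, final_expiry, delta):
    """Create one HTLC per hop; expiries shrink by `delta` toward the recipient."""
    hops = len(nodes) - 1
    return [
        HTLC(nodes[i], nodes[i + 1], amount, payment_hash,
             final_expiry + delta * (hops - 1 - i))
        for i in range(hops)
    ]


R = b"secret preimage"
H = hashlib.sha256(R).digest()
route = build_route(["Alice", "Bob", "Carol"], 1000, H, final_expiry=100, delta=40)
```

Carol claims from Bob by revealing R before height 100. Bob then has until height 140 to use the same R against Alice, so an intermediary is never left having paid out without being able to collect.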
Fundamentally, the Lightning Network circumvents the external trust assumptions associated with asset bridging through peer-to-peer state channels, while utilizing time-lock scripts to provide the ultimate safeguard for assets, offering fault protection. This allows for unilateral exit when the counterparty becomes inactive or does not cooperate. Therefore, the Lightning Network holds high utility in payment scenarios, but it also has several limitations, including:
Channel Capacity Limitations: The capacity of payment channels in the Lightning Network is limited by the initial funds locked, which cannot support payments exceeding the channel’s capacity. This may restrict some use cases, such as large commodity transactions.
Online and Synchronization Requirements: To receive and forward payments promptly, nodes in the Lightning Network need to remain online. If a node is offline for an extended period, it might miss some updates on channel states, leading to desynchronization. This can be challenging for individual users and mobile devices, and it increases the operational costs for nodes.
Liquidity Management: The routing efficiency of the Lightning Network depends on the liquidity distribution among channels. If funds are unevenly distributed, some payment paths may become ineffective, affecting user experience. Managing the liquidity balance of channels requires certain technical and financial resources.
Privacy Concerns: To find viable payment paths, the routing algorithms of the Lightning Network need some knowledge of channel capacity and connectivity, which could compromise user privacy by revealing information such as fund distribution and trading counterparts. The opening and closing of payment channels might also expose information about the participants involved.
The initial concept of the RGB protocol was inspired by Peter Todd's ideas of client-side validation and single-use seals. It was proposed by Giacomo Zucco in 2016 and is a scalable, privacy-preserving second-layer protocol for Bitcoin.
The verification process in blockchain involves broadcasting blocks composed of transactions to the entire network, allowing every node to compute and verify the transactions within those blocks. This effectively creates a public good, where the nodes across the network assist each individual who submits a transaction with verification, with users providing BTC as a transaction fee as an incentive for this verification. Client-side validation is more individual-centric, with state verification not performed globally but by the individuals involved in a specific state transition. Only the parties generating the transactions validate the legitimacy of these state transitions, significantly enhancing privacy, reducing the burden on nodes, and improving scalability.
Peer-to-peer state transitions pose a risk where, without access to a complete history of state transitions, users can be defrauded, leading to double-spending. Single-use seals were proposed to address this issue. By using a special object that can only be used once, they ensure that double-spending does not occur, thereby enhancing security. Bitcoin's UTXO (Unspent Transaction Output) model is the most suitable form of a single-use seal, protected by Bitcoin's consensus mechanism and network hash power, allowing RGB assets to inherit Bitcoin's security features.
Single-use seals need to be combined with cryptographic commitments to ensure that users are clearly aware of state transitions and to prevent double-spending attacks. A commitment informs others that something has occurred and cannot be altered later, without revealing the specifics until verification is needed. This can be achieved using hash functions. In RGB, the content of the commitment is the state transition, signaled to the recipient of RGB assets through the spending of a UTXO. The asset recipient then verifies the commitment against the specific data transmitted off-chain by the spender of the assets.
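The combination of a single-use seal and a hash commitment can be illustrated with a short sketch. Everything here is a stand-in: `SealRegistry` plays the role of Bitcoin's UTXO set, outpoints are plain strings, and the state transition is serialized with JSON purely for illustration. RGB uses its own data encoding and commitment scheme.

```python
import hashlib
import json


def commit(state_transition):
    """Hash commitment to a state transition (JSON encoding is an assumption)."""
    encoded = json.dumps(state_transition, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class SealRegistry:
    """Stand-in for the UTXO set: each outpoint can be closed (spent) only once."""

    def __init__(self):
        self.closed = {}  # outpoint -> commitment published when it was spent

    def close(self, outpoint, commitment):
        if outpoint in self.closed:
            raise ValueError("seal already closed (double spend)")
        self.closed[outpoint] = commitment

    def verify(self, outpoint, state_transition):
        """Recipient checks off-chain data against the on-chain commitment."""
        return self.closed.get(outpoint) == commit(state_transition)


seals = SealRegistry()
transfer = {"asset": "TOKEN", "amount": 10, "to": "txid1:0"}
seals.close("txid0:0", commit(transfer))  # spender closes the seal on-chain
```

The recipient receives `transfer` off-chain and accepts it only if `seals.verify("txid0:0", transfer)` holds. Any altered copy of the transfer fails the check, and a second attempt to close the same seal is rejected.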
RGB utilizes Bitcoin's consensus to ensure double-spend security and censorship resistance, while all state transition verification tasks are delegated to off-chain, performed only by the client receiving the payment.
For issuers of RGB assets, creating an RGB contract involves initiating a transaction where the commitment to specific information is stored in an OP_RETURN script within a Taproot transaction condition.
When the holder of an RGB asset wishes to spend it, they need to obtain relevant information from the recipient of the asset, create an RGB transaction, and commit the details of this transaction. The commitment is then placed into a UTXO specified by the asset recipient, and a transaction is issued to spend the original UTXO and create a new UTXO as specified by the recipient. When the asset recipient notices that the UTXO storing the RGB asset has been spent, they can verify the validity of the RGB transaction through the commitment in the Bitcoin transaction. Once verified as valid, they can confidently acknowledge the receipt of the RGB asset.
For the recipients of RGB assets, the payer must provide the initial state and rules for state transitions of the contract, each Bitcoin transaction used in the transfer, the RGB transactions committed by each Bitcoin transaction, and evidence of the validity of each Bitcoin transaction. The recipient's client uses this data to verify the validity of the RGB transactions. In this setup, Bitcoin's UTXO acts as a container that holds the state of the RGB contract. The transfer history of each RGB contract can be represented as a directed acyclic graph (DAG), and the recipient of an RGB asset can only access the history related to their held assets, not any other branches.
Compared to the complete verification required by the blockchain, the RGB protocol significantly reduces the cost of verification. Users do not need to traverse all historical blocks to obtain the latest state; they only need to synchronize the history relevant to the assets they receive to verify the validity of transactions.
This lightweight verification makes peer-to-peer transactions easier and further reduces reliance on centralized service providers, enhancing decentralization.
The RGB protocol needs only a hash commitment to inherit Bitcoin's security, and because it uses Taproot scripts, it consumes almost no additional Bitcoin blockchain space. This makes complex asset programming possible. Using UTXOs as containers, the RGB protocol naturally supports concurrency; RGB assets on different transfer branches do not block each other and can be spent simultaneously.
Unlike typical protocols, only the recipients of RGB assets can access the history of asset transfers. Once spent, they cannot access the history of future transfers, significantly ensuring user privacy. Transactions of RGB assets and the transfer of Bitcoin UTXOs are not linked, making it impossible for outsiders to trace RGB transactions on the Bitcoin blockchain.
Moreover, RGB supports the blinding of outputs, which means that the payer cannot determine which UTXO the RGB assets will be paid into, further enhancing privacy and resistance to censorship.
When RGB assets change hands multiple times, new asset recipients may face a considerable verification burden to validate a lengthy transfer history, potentially resulting in longer verification times and losing the ability to confirm transactions quickly. By contrast, nodes operating in a blockchain are always synchronized with the latest state, so the time they need to verify state transitions upon receiving new blocks is bounded.
The community is discussing the possibility of reusing historical computations, and recursive ZK Proofs could potentially achieve constant time and size for state verification.
Rollups are widely regarded as the leading scaling solution for the Ethereum ecosystem, the result of years of exploration that progressed from state channels to Plasma and finally to Rollups.
A Rollup is an independent blockchain that collects transactions off the Bitcoin chain, batches multiple transactions, executes them, and commits the batch data and state commitments to the main chain. This achieves off-chain transaction processing and state updates. To maximize scalability, Rollups typically use a centralized sequencer at this stage to enhance execution efficiency without compromising security, as the security is ensured by the main chain's verification of Rollup state transitions.
As the Ethereum ecosystem's Rollup solutions mature, the Bitcoin ecosystem has also begun to explore Rollups. However, a key difference between Bitcoin and Ethereum is the lack of programming capabilities, which makes it impossible to perform the necessary computations for building Rollups on-chain. Currently, efforts are focused on implementing sovereign Rollups and OP Rollups.
Rollups can be divided into two main categories: Optimistic Rollups and Validity Rollups (ZK Rollups), with the main difference being the method of state transition verification.
Optimistic Rollup uses an optimistic verification method. During the dispute period after each batch of transactions is submitted, anyone can check the off-chain data and raise objections to problematic batches by submitting fraud proofs to the main chain, resulting in penalties for the Sequencer. If no valid fraud proof is submitted during the dispute period, the transaction batch is deemed valid, and the state update is confirmed on the main chain.
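The optimistic flow described above can be sketched as follows. This is a toy model with hypothetical names (`Batch`, `OptimisticRollup`). The fraud proof is reduced to supplying the state root obtained by re-executing the batch, whereas a real system must prove the discrepancy to the main chain. Sequencer penalties are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    state_root: str         # root claimed by the Sequencer
    submitted_at: int       # main-chain height at submission
    challenged: bool = False


class OptimisticRollup:
    def __init__(self, dispute_period):
        self.dispute_period = dispute_period  # measured in main-chain blocks
        self.batches = []

    def submit(self, state_root, height):
        self.batches.append(Batch(state_root, height))
        return len(self.batches) - 1

    def challenge(self, index, recomputed_root, height):
        """Return True if the fraud proof succeeds: it must arrive inside the
        dispute window and show a root different from the claimed one."""
        batch = self.batches[index]
        in_window = height < batch.submitted_at + self.dispute_period
        success = in_window and recomputed_root != batch.state_root
        if success:
            batch.challenged = True
        return success

    def is_final(self, index, height):
        """A batch is confirmed once the window closes without a valid challenge."""
        batch = self.batches[index]
        window_closed = height >= batch.submitted_at + self.dispute_period
        return window_closed and not batch.challenged
```

The model makes the confirmation delay explicit: no batch can be final before its dispute window has elapsed, regardless of whether anyone challenges it.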
Validity Rollup uses Validity Proof for verification. The Sequencer uses a zero-knowledge proof algorithm to generate a concise validity proof for each batch of transactions, proving that the state transition of that batch is correct. Each update requires the submission of a validity proof of the transaction batch to the main chain, which verifies the proof and confirms the state update immediately.
The advantage of Optimistic Rollup is its relative simplicity and minimal modification to the main chain. However, its drawbacks are longer transaction confirmation times (dependent on the dispute period) and a higher requirement for data availability. Validity Rollup has the advantages of fast transaction confirmation, independence from dispute periods, and the ability to keep transaction data private. However, generating and verifying zero-knowledge proofs requires significant computational overhead.
Celestia has also proposed the concept of a sovereign Rollup where the Rollup's transaction data is published to a dedicated Data Availability (DA) layer blockchain, which is responsible for data availability, while the sovereign Rollup itself handles execution and settlement.
Bitcoin-based Rollups are still in the early stages. Due to the differences in the accounting model and programming language from Ethereum, it is challenging to directly replicate Ethereum's practices. The Bitcoin community is actively exploring innovative solutions.
On March 5, 2023, Rollkit announced that it became the first framework to support Bitcoin sovereign Rollups. Builders of sovereign Rollups can publish availability data on Bitcoin using Rollkit.
Inspired by Ordinals, Rollkit utilizes Taproot transactions to publish data. A Taproot transaction that conforms to the public mempool standard can contain up to 390KB of data, while a non-standard Taproot transaction directly published by miners can contain nearly 4MB of arbitrary data.
Rollkit essentially provides an interface for reading and writing data on Bitcoin, offering middleware services that turn Bitcoin into a DA layer.
The idea of sovereign Rollup has faced significant skepticism. Many critics claim that Bitcoin-based sovereign Rollups merely use Bitcoin as a bulletin board and cannot inherit Bitcoin's security. In fact, if only transaction data is submitted to Bitcoin, it only improves liveness - ensuring that all users can access and verify the relevant data through Bitcoin. However, security can only be defined by the sovereign Rollup itself and cannot be inherited. Additionally, the block space on Bitcoin is extremely valuable, and submitting full transaction data may not be a good decision.
Although many Bitcoin layer-2 projects claim to be ZK Rollups, they are essentially closer to OP Rollups that incorporate Validity Proof technology, because Bitcoin's programming capabilities are currently insufficient to support direct Validity Proof verification.
The current Bitcoin opcode set is very limited and cannot even compute multiplication directly. Verifying a Validity Proof requires an expansion of opcodes and depends largely on the implementation of recursive covenants. The community is actively discussing options including OP_CAT, OP_CHECKSIGFROMSTACK, OP_TXHASH, etc. Ideally, adding an OP_VERIFY_ZKP might solve the issue without any other modifications, but this is highly unlikely. Additionally, stack size limitations also hinder efforts to verify Validity Proofs within Bitcoin scripts, with many explorations ongoing.
So how is the Validity Proof used in practice? Most projects publish the state diff and Validity Proof of batched transactions to Bitcoin as inscriptions and use BitVM for optimistic verification. In this scheme, the bridge's Operators act as a federation, managing user deposits. Before a user makes a deposit, the federation pre-signs the UTXO to ensure that the deposit can only be legally claimed by an Operator. After obtaining the pre-signature, the BTC is locked into an N-of-N multisig Taproot address.
When a user requests a withdrawal, the Rollup sends the withdrawal Root with the Validity Proof to the Bitcoin chain. The Operator initially pays out of pocket to meet the user's withdrawal needs, and later, the BitVM contract verifies the validity. If every Operator considers the proof valid, they reimburse the Operator through a multisig; if anyone believes there is fraudulent activity, a challenge process ensues, and the wrong party is slashed.
This process is essentially identical to an OP Rollup, where the trust assumption is 1/N - as long as one verifier is honest, the protocol is secure. As for Validity Proof, it is not intended to make verification easier for the Bitcoin network, but rather to facilitate easier verification by individual nodes.
However, the technical implementation of this solution may face challenges. Among Ethereum's OP Rollup projects, Arbitrum has undergone years of development, yet its Fraud Proofs can still only be submitted by permissioned nodes; Optimism still does not support Fraud Proofs, which indicates the difficulty of implementation.
With the support of Bitcoin Covenants, the pre-signing step in the BitVM bridge could be executed more efficiently, but this still awaits community consensus.
From a security perspective, a Rollup that submits its block hashes to Bitcoin gains resistance to reorganization and double-spending, while the optimistic bridge brings a 1/N security assumption. The bridge's resistance to censorship could also be improved further.
As we examine the various Layer 2 solutions, it becomes clear that each solution has its limitations. The effectiveness of Layer 2 largely depends on the capabilities of Layer 1 — that is, Bitcoin, under specific trust assumptions.
Without the SegWit upgrade and time locks, the Lightning Network could not have been successfully built; without the Taproot upgrade, the commitments in RGB could not be efficiently submitted; without OP_CAT and other Covenants, Validity Rollups on Bitcoin would not be feasible...
Many Bitcoin maximalists believe that Bitcoin should never change, should not add new features, and that all deficiencies should be addressed by Layer 2 solutions. However, this is unachievable; Layer 2 is not a silver bullet. We need a more powerful Layer 1 to build a more secure, efficient, and scalable Layer 2.
In our next piece, we will explore attempts to enhance programmability on Bitcoin. Stay tuned.