Bitcoin is the earliest, most secure, most decentralized, and highest-valued blockchain in the world. However, its low transactions per second (TPS) and limited programming capabilities have often been criticized, making it difficult to support large-scale applications and significantly hindering the development of the Bitcoin ecosystem. As a builder in the Bitcoin ecosystem, Chakra will guide us through the past, present, and future of Bitcoin scaling solutions.
This article is the first in the Bitcoin Scalability series, primarily introducing native scaling solutions historically implemented on the Bitcoin mainnet.
In 2010, Satoshi Nakamoto introduced a 1MB block size limit in bitcoin-core. Over a decade later, this explicit limit has not been modified.
Interestingly, Satoshi did not publicly explain why he proposed the block size limit; the limit was "hidden" in a code change without detailed explanation. Several years after Satoshi left, the community became deeply divided over the block size limit, and the demand for larger blocks sparked widespread discussion.
Larger blocks can accommodate more transactions. Assuming the block interval remains unchanged, larger blocks mean higher TPS.
Why is TPS so important? Because with the 1MB block limit, given the transaction size at that time, the number of transactions per second could only be 3-7, which is insufficient for large-scale applications and fails to realize Bitcoin's vision of a "peer-to-peer electronic cash system."
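The 3-7 TPS range can be reproduced with simple arithmetic. The sketch below assumes a block interval of about 600 seconds and average transaction sizes of 250 to 500 bytes; both figures are illustrative assumptions, not values taken from this article, and the function name `estimated_tps` is hypothetical.

```python
BLOCK_SIZE_LIMIT_BYTES = 1_000_000  # the 1MB block size limit
BLOCK_INTERVAL_SECONDS = 600        # assumed ~10 minute block interval


def estimated_tps(avg_tx_size_bytes: int,
                  block_size_bytes: int = BLOCK_SIZE_LIMIT_BYTES,
                  block_interval_seconds: int = BLOCK_INTERVAL_SECONDS) -> float:
    """Rough throughput estimate: transactions per block divided by the block interval."""
    txs_per_block = block_size_bytes // avg_tx_size_bytes
    return txs_per_block / block_interval_seconds


# Assumed average transaction sizes (bytes); smaller transactions give higher TPS.
for size in (250, 500):
    print(size, round(estimated_tps(size), 2))
```

Under these assumptions the estimate lands between roughly 3 and 7 TPS, and doubling `block_size_bytes` doubles the result, which is why larger blocks were the first scaling idea considered.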
However, larger blocks also bring problems of their own.
Firstly, larger blocks demand higher hardware requirements in terms of storage, computation, and bandwidth, resulting in increased operational costs for full nodes. Bitcoin's history would expand rapidly, requiring new full nodes to spend more time syncing with the network. These demands decrease users' willingness to operate full nodes, which in turn reduces the degree of decentralization.
Secondly, larger blocks increase the synchronization time between nodes, raise the likelihood of orphan blocks, and lead to more frequent block reorganizations and increased risk of forks, significantly reducing security.
Later, this issue was described by Vitalik as the blockchain trilemma, meaning a blockchain cannot simultaneously achieve decentralization, scalability, and security. Larger blocks enhance scalability but at the expense of weakening decentralization and security.
Most importantly, modifying the block size limit requires a hard fork, which requires all nodes on the network to upgrade at the same time; otherwise, the network would split. This is not a favorable option for Bitcoin, which relies on decentralized consensus. Influenced by Satoshi Nakamoto, avoiding hard forks seems to have become a de facto principle of Bitcoin.
Unfortunately, splits did occur. Despite the lack of consensus within the community, some miners and developers changed the block size limit in their clients, ultimately leading to network forks. The Bitcoin XT client adopted BIP 101 in 2015, raising the block size to 8MB; Bitcoin Classic adopted BIP 109 in 2016 to raise the block size limit to 2MB. However, the vast majority of miners and users stayed on what we now know as the Bitcoin Mainnet.
Efforts to explicitly increase the block size through hard forks failed.
If hard forks are unacceptable, could soft forks be a solution? SegWit is one such approach.
A witness is a credential for unlocking a UTXO, and for a long time, witnesses were placed in the input script field of the spending transaction. However, this method led to potential issues such as circular dependencies, third-party transaction malleability, and second-party transaction malleability.
As early as 2011, developers noticed this problem and proposed Segregated Witness (SegWit), which separates the witness from other transaction data. However, the hard fork proposals at the time did not gain support, and it was not until a SegWit soft fork was proposed in 2015 that the change was finally merged.
How does SegWit achieve backward compatibility through a soft fork? Backward compatibility here covers two aspects:
New version nodes can recognize and accept blocks and transactions produced by old version nodes.
Although old version nodes cannot recognize new rules and features introduced by new versions, they still treat blocks produced by new versions as valid.
The SegWit soft fork allows new transactions to use empty input scripts and adds a Witness field to the block structure to store the witness. Since empty input scripts were already supported by pre-upgrade versions of Bitcoin Core, old version nodes do not reject blocks produced by new versions. In addition, thanks to the version field, old transaction types can still be used, and nodes handle them differently based on the version.
The scaling in SegWit is implemented in the form of weights, with a weight of 1 for witness bytes and a weight of 4 for other data bytes, limiting the maximum weight of each block to 4 million. Why assign different weights to different types of data? A common-sense idea is that witness data only serves a verification purpose when used and does not need to be preserved in storage long-term, thus it incurs relatively lower costs and is assigned a lower weight.
This effectively acts as a disguised increase in the block size limit, with the theoretical block size limit raised to 4MB (entirely due to Witness data), and on average, blocks can reach around 2MB. From the perspective of the old block structure, this still adheres to Satoshi Nakamoto's original limit of not exceeding 1MB per block.
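The weight rule described above can be written down in a few lines. This is a minimal sketch of the accounting only (4 weight units per non-witness byte, 1 per witness byte, at most 4 million per block); the function names are hypothetical, and real nodes compute weight from serialized transactions rather than from raw byte counts.

```python
MAX_BLOCK_WEIGHT = 4_000_000   # maximum weight of a block
NON_WITNESS_BYTE_WEIGHT = 4    # weight of each non-witness byte
WITNESS_BYTE_WEIGHT = 1        # weight of each witness byte


def block_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Total weight of a block given its byte counts."""
    return (NON_WITNESS_BYTE_WEIGHT * non_witness_bytes
            + WITNESS_BYTE_WEIGHT * witness_bytes)


def fits_in_block(non_witness_bytes: int, witness_bytes: int) -> bool:
    """True if a block with these byte counts respects the weight limit."""
    return block_weight(non_witness_bytes, witness_bytes) <= MAX_BLOCK_WEIGHT


# A block with no witness data is still capped at 1,000,000 bytes.
print(fits_in_block(1_000_000, 0), fits_in_block(1_000_001, 0))
# A 2MB block fits when a large share of its bytes is witness data
# (the 600,000 / 1,400,000 split is an illustrative assumption).
print(fits_in_block(600_000, 1_400_000))
```

The first example shows why old nodes still see blocks of at most 1MB, while the second shows how total block size can grow once signatures move into the witness.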
Using Bitcoin's opcodes like OP_IF, we can set complex conditions for Bitcoin's spending scripts, such as time locks and multisignature requirements. However, complex spending conditions often require multiple inputs and signatures for verification, increasing block payload and reducing transaction speed, while also exposing all payment conditions, leading to privacy leaks.
Taproot uses MAST (Merkelized Abstract Syntax Trees) to enhance Bitcoin, where users represent spending conditions with a Merkle tree. Each leaf node represents a spending script, and during spending, only the script actually being executed and the corresponding Merkle path need to be provided, without revealing the other conditions. This results in smaller block space consumption and improved privacy.
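To make the "one script plus a Merkle path" idea concrete, here is a minimal sketch of building a script tree and verifying a single leaf. It is a simplification under stated assumptions: it uses plain SHA-256 with ad hoc `leaf:` and `branch:` prefixes and requires a power-of-two number of scripts, whereas Taproot itself uses tagged hashes and its own leaf encoding. All function names and the placeholder scripts are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(script: bytes) -> bytes:
    return h(b"leaf:" + script)


def branch_hash(left: bytes, right: bytes) -> bytes:
    return h(b"branch:" + left + right)


def build_tree(scripts):
    """Return all tree levels, from the leaf hashes up to the single root."""
    n = len(scripts)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of scripts")
    level = [leaf_hash(s) for s in scripts]
    levels = [level]
    while len(level) > 1:
        level = [branch_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path


def verify(script: bytes, path, root: bytes) -> bool:
    """Recompute the root from one revealed script and its Merkle path."""
    node = leaf_hash(script)
    for sibling, sibling_is_left in path:
        node = branch_hash(sibling, node) if sibling_is_left else branch_hash(node, sibling)
    return node == root


# Four placeholder spending conditions; only one is revealed when spending.
scripts = [b"timelock", b"2-of-3 multisig", b"hashlock", b"recovery key"]
levels = build_tree(scripts)
root = levels[-1][0]
proof = merkle_path(levels, 1)
print(verify(b"2-of-3 multisig", proof, root))
```

With four scripts the spender reveals one script and two sibling hashes; the other three scripts never appear on-chain, and the proof grows only logarithmically with the number of conditions.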
The Taproot upgrade also introduces Schnorr signatures, which are linear (additively homomorphic), allowing for signature aggregation and batch verification and thereby improving overall transactions per second (TPS). Aggregation significantly simplifies the logic for verifying multisignature transactions. Previously, with ECDSA, multiple signatures had to be sent to the chain to match the script, whereas with Schnorr signatures only a single signature, aggregated off-chain, needs to be sent to the chain, reducing the chain space used by multisignature payments.
By combining Schnorr signatures with MAST and using the Pay to Contract (P2C) concept, the MAST root that commits to the complex contract code is used to tweak a public key, producing a standard Bitcoin public key that can still be spent with a single Schnorr signature.
Interestingly, because individual and aggregated signatures appear the same on-chain with Schnorr signatures, complex scripts, multisignature spends, and single-signature spends cannot be distinguished on-chain, further enhancing privacy.
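The linearity that makes aggregation possible can be illustrated with a toy example. The sketch below is not Bitcoin's signature scheme: it uses a tiny multiplicative group (modulus 2039, subgroup order 1019, generator 4) instead of the secp256k1 curve, fixed nonces for reproducibility, and naive key aggregation, which is insecure against rogue-key attacks and is handled in practice by dedicated protocols such as MuSig. All names and parameters are illustrative assumptions.

```python
import hashlib

# Toy group parameters (assumptions for illustration only, far too small to be secure).
P = 2039   # safe prime, P = 2*Q + 1
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of the order-Q subgroup


def challenge(R: int, X: int, msg: bytes) -> int:
    """Hash the nonce commitment, public key, and message into a scalar."""
    data = f"{R}|{X}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def public_key(x: int) -> int:
    return pow(G, x, P)


def verify(X: int, msg: bytes, sig) -> bool:
    """Schnorr check: G^s == R * X^e."""
    R, s = sig
    e = challenge(R, X, msg)
    return pow(G, s, P) == (R * pow(X, e, P)) % P


def aggregate_sign(private_keys, nonces, msg: bytes):
    """Naive aggregation: combine keys and nonces, then sum the partial signatures."""
    X_agg, R_agg = 1, 1
    for x, k in zip(private_keys, nonces):
        X_agg = (X_agg * public_key(x)) % P
        R_agg = (R_agg * pow(G, k, P)) % P
    e = challenge(R_agg, X_agg, msg)
    s = sum(k + e * x for x, k in zip(private_keys, nonces)) % Q
    return X_agg, (R_agg, s)


# Three signers produce one (R, s) pair that verifies against one aggregated key.
X_agg, sig = aggregate_sign([123, 456, 789], [11, 22, 33], b"spend")
print(verify(X_agg, b"spend", sig))
```

Because the partial values simply add up, the verifier runs the same single check whether one party or several signed, which is what lets a multisignature spend occupy the space of one signature.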
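The Pay to Contract idea can be sketched as a key tweak: an internal public key is shifted by a hash of itself and the MAST root, so the resulting output key commits to the scripts while still being an ordinary key. The example below reuses a toy multiplicative group rather than secp256k1 and a plain SHA-256 hash rather than Taproot's tagged hashes; the parameters and function names are assumptions for illustration.

```python
import hashlib

# Toy group parameters (illustrative assumptions, not secure).
P_MOD = 2039    # safe prime
Q_ORDER = 1019  # prime order of the subgroup generated by G
G = 4           # generator of that subgroup


def tweak(internal_pub: int, mast_root: bytes) -> int:
    """Scalar derived from the internal key and the script tree root."""
    data = f"{internal_pub}|".encode() + mast_root
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q_ORDER


def output_key(internal_pub: int, mast_root: bytes) -> int:
    """Public key that commits to mast_root: internal_pub * G^tweak."""
    return (internal_pub * pow(G, tweak(internal_pub, mast_root), P_MOD)) % P_MOD


def tweaked_private_key(internal_priv: int, mast_root: bytes) -> int:
    """Private key matching output_key, used for a single-signature spend."""
    internal_pub = pow(G, internal_priv, P_MOD)
    return (internal_priv + tweak(internal_pub, mast_root)) % Q_ORDER


def commits_to(output_pub: int, internal_pub: int, mast_root: bytes) -> bool:
    """Check a revealed internal key and root against the output key."""
    return output_key(internal_pub, mast_root) == output_pub


internal_priv = 321
internal_pub = pow(G, internal_priv, P_MOD)
root = hashlib.sha256(b"example script tree").digest()  # placeholder root
out = output_key(internal_pub, root)
print(commits_to(out, internal_pub, root))
```

In this sketch the owner can spend with one signature from `tweaked_private_key`, or alternatively reveal the internal key and root so that a verifier can confirm the output key really committed to the script tree.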
Bitcoin's scalability solutions reflect its evolving approach to maintaining decentralization and security while enhancing performance.
Initially, increasing block size was considered, directly addressing low transaction rates but raising issues related to node costs and network forking, challenging community consensus.
The introduction of SegWit marked a significant advancement, optimizing block capacity via a soft fork, ensuring backward compatibility and avoiding divisive hard forks.
Subsequently, Taproot further refined scalability and privacy through MAST and Schnorr signatures, reducing transaction space and enhancing validation efficiency. More importantly, Taproot enables complex script programming on Bitcoin, which paves the way for future scaling attempts.
These developments underscore Bitcoin's cautious yet innovative progression towards a more scalable and robust network, crucial for its future as a global payment system. Everything achieved by Chakra is built upon these innovations.
However, the impact of these scaling solutions is not yet sufficient to realize the vision of a "peer-to-peer electronic cash system." In our next blog, we will discuss off-chain scaling solutions with a higher degree of scalability. Stay tuned.