Originally published on March 8th, 2023.
Special thanks to those who reviewed the article and offered feedback: Sreeram Kannan (@sreeramkannan), Toghrul Maharramov (@toghrulmaharram), Dubbelosix (@dubbel06), Polynya (@apolynya), Hasu (@hasufl), and Ryan Berckmans (@ryanberckmans).
Before we discuss light nodes, let’s put them into context with an overview of blockchain nodes more broadly. Typically, blockchain nodes are thought of as one of three types:
Validator nodes
Full nodes
Light nodes (AKA light clients)
Technically, there are other types (e.g. archival nodes), but these are the main three.
I’ll borrow definitions from Chainlinkgod’s awesome article on blockchain trust models and network participants. In the article he uses “block producers” to cover both Proof-of-Work (PoW) miners and Proof-of-Stake (PoS) validators, but we’re going to focus on PoS-land and therefore on validators.
Validator nodes are “entities responsible for ordering and packaging transactions into discrete data structures called blocks which are then proposed to the network to validate. If two valid blocks are produced at the same block height, [validators] are responsible for determining which version of the chain is canonical.”
What about PBS? Proposer-builder separation (PBS) is a planned Ethereum upgrade that will split validators into two separate roles: builders, who order and package transactions into blocks, and proposers, who propose and vote on blocks. It will likely take several years to implement, so for simplicity I’ll leave it out of this article.
Full nodes “download and self-verify each block proposed by [validators]. If the block is found to be valid (i.e. protocol rules have been followed), then the block is added to the full node’s personal copy of the ledger and state changes are applied. Any invalid blocks not in line with the protocol’s rules are ignored and consequently discarded without any state changes occurring.”
Light nodes are “a limited form of a full node where only the headers (i.e. small unique cryptographic fingerprints) of blocks are downloaded. Light clients [AKA light nodes] can verify if a transaction was included within a block, but because they do not download or execute all transactions within blocks, they implicitly trust that the majority of block producers are honest.”
I find these definitions fantastic, but let’s go a step further towards ELI5.
Basically, validators create and propose blocks (ordered lists of transactions) when it’s their turn and vote on other validators’ blocks when it’s not. They only vote on blocks that follow the protocol rules, since blocks that break the rules are ignored. Validators know which blocks to ignore because they are also full nodes and verify transactions themselves.
Validators need to be full nodes, but full nodes don’t need to be validators. If your computer meets the minimum requirements, you can gain the benefits of a full node. You don’t need to stake 32 ETH.
Why is thinking about full nodes separately from validators important?
For that, I’ll refer you to this post by Dankrad, an OG Ethereum researcher. He explains how full nodes keep validators in check, “a bit like the separation of powers in liberal democracies”.
So where do light nodes come in?
By now it’s clear that most users will not run full nodes. It shouldn’t be all that surprising, since there’s meaningful friction involved:
Buy a new computer
Dedicate computer to being a full node
Configure the software
Maintain the software
Buy new hardware every so often
If there’s anything we’ve learned from the social / mobile / cloud wave, it’s that end users don’t like friction.
So, the goal of light nodes is to have our cake and eat it too: to offer users security guarantees without making them actually run full nodes.
Light nodes use cryptographic techniques to lower the cost of verifying blockchain information. The vision is for light nodes to offer close to full-node security, get embedded into mobile phones, and run in the background as a user default (no friction!).
While the vision is exciting, we’re not quite there yet. Let’s walk through different types of light nodes and their current status.
The industry hasn’t reached clear consensus on light node nomenclature, so we’ll make up some terms based on Crypto Twitter (CT) discourse.
Light nodes can be classified based on the information they verify:
Consensus verifiers
State verifiers
Data availability (DA) verifiers
Full verifiers
Consensus verifiers can check that a transaction is included in the canonical chain. They follow the canonical chain by tracking validator attestations and check transactions by verifying Merkle proofs.
This is similar to how IBC works. Cosmos chains basically run consensus verifiers on chain and verify the consensus of connected chains. Generally speaking, light nodes today are consensus verifiers.
Ethereum light nodes aren’t true consensus verifiers. However, the problem is being solved in a different way, so Ethereum should effectively have consensus verifiers within a year.
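To make the Merkle proof part concrete, here is a minimal sketch of how a light node can check that a transaction is included under a root it already trusts from a block header. Everything in it is an assumption for illustration: the function names (`merkle_root`, `merkle_proof`, `verify_inclusion`) are made up, the tree is a plain binary SHA-256 tree that duplicates the last node on odd levels, and real chains use their own tree layouts and hashing rules.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    # Hash adjacent pairs to build the parent level.
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a toy binary Merkle tree over a non-empty list of leaves."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # toy rule: duplicate the last node
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, as (hash, sibling_is_left) pairs.

    This is the part a full node would compute and hand to the light node.
    """
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(tx: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """What the light node runs: rebuild the root from the tx and its siblings."""
    node = sha256(tx)
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


# Usage: the light node only needs the root, the tx, and a short proof.
txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
assert verify_inclusion(b"tx2", proof, root)
assert not verify_inclusion(b"forged", proof, root)
```

The point of the sketch is the size difference: the proof grows with the logarithm of the number of transactions, so the light node never has to download the block itself. It still has to trust that the root came from the canonical chain, which is the job of the consensus-following part.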
Consensus verifiers are useful, but there's still risk. You are still trusting the validators. What if the validators include an invalid transaction or state transition?
Here’s where state verifiers come in. State verifiers can check that state transitions are valid without processing transactions.
This can be accomplished using either validity (AKA ZK) or fraud proofs. Light nodes verify proofs that they download alongside block headers to ascertain whether the new state is correct (for validity proofs) or incorrect (for fraud proofs).
There are many resources to learn how proofs work, so I won’t get into it here. The one thing I will point out is the difference in trust: with fraud proofs, light nodes depend on honest full nodes to tell them that fraud occurred so they know not to accept the new state, whereas with validity proofs, validators can directly convince light nodes that blocks are valid and that the new state should be accepted.
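The asymmetry between the two proof types can be sketched in a few lines. This is only an illustration under stated assumptions: `Header`, `check_with_validity_proof`, and `check_with_fraud_proofs` are hypothetical names, the actual proof verifiers are passed in as stand-in callables, and the fixed challenge window is an assumption I'm adding to model "wait and see whether an honest full node objects".

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Header:
    height: int
    state_root: str


# Stand-in type for a real proof verifier (hypothetical signature).
Verifier = Callable[[Header, bytes], bool]


def check_with_validity_proof(header: Header, proof: bytes, verify: Verifier) -> str:
    """Validity proofs: one proof is enough to decide right away."""
    return "accept" if verify(header, proof) else "reject"


def check_with_fraud_proofs(
    header: Header,
    fraud_proofs: Iterable[bytes],
    verify_fraud: Verifier,
    seconds_waited: float,
    challenge_window: float,
) -> str:
    """Fraud proofs: reject on any valid fraud proof, otherwise wait out the window.

    Accepting here assumes at least one honest full node would have
    sent a fraud proof during the window if the state were invalid.
    """
    if any(verify_fraud(header, p) for p in fraud_proofs):
        return "reject"
    if seconds_waited < challenge_window:
        return "wait"
    return "accept"


# Usage with toy verifiers that just compare bytes (not real cryptography).
header = Header(height=1, state_root="abc")
toy_validity = lambda h, p: p == b"valid:" + h.state_root.encode()
toy_fraud = lambda h, p: p == b"fraud:" + h.state_root.encode()

assert check_with_validity_proof(header, b"valid:abc", toy_validity) == "accept"
assert check_with_fraud_proofs(header, [], toy_fraud, 10, 60) == "wait"
assert check_with_fraud_proofs(header, [b"fraud:abc"], toy_fraud, 120, 60) == "reject"
```

In the fraud proof path the light node accepts because nothing was reported, which is why it needs a connection to at least one honest full node. In the validity proof path the proof itself carries the guarantee.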
State verifiers are powerful (and mobile friendly!), but the problem is not yet completely solved.
Validators can still withhold block data, so nodes might know that computations were done correctly yet be unable to fetch the new blockchain state from full nodes. There are more complex attacks involving data, but even this simple case shows the problem: without the data, the blockchain can’t progress.
DA verifiers are the final piece of the puzzle. They can check that the inputs to the computation are published somewhere the network can download them from if needed, without actually downloading all the data.
How is this possible? Using a technique called data availability sampling. Basically, you can be very confident that all the block data is available somewhere if a bunch of light nodes are checking small chunks of it. More nodes sampling = more security.
This topic is complicated and not as familiar to many, so here’s a comprehensive breakdown by Vitalik and a nice mental model by Nick White.
Also mobile friendly!
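For a rough feel of why a handful of samples goes a long way, here is a back-of-the-envelope sketch. It rests on assumptions that are mine, not the article's: the block is erasure coded so that an attacker has to withhold at least some fraction of the chunks to make it unrecoverable (one half is used below purely as an example), and each sample is an independent, uniformly random chunk. The function names are hypothetical.

```python
import math


def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that every one of `samples` random checks hits an available chunk,
    even though `withheld_fraction` of the chunks are being withheld."""
    if not 0.0 < withheld_fraction < 1.0:
        raise ValueError("withheld_fraction must be strictly between 0 and 1")
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return (1.0 - withheld_fraction) ** samples


def samples_needed(withheld_fraction: float, target_miss_probability: float) -> int:
    """Smallest number of samples that pushes the miss probability to the target or below."""
    if not 0.0 < withheld_fraction < 1.0:
        raise ValueError("withheld_fraction must be strictly between 0 and 1")
    if not 0.0 < target_miss_probability < 1.0:
        raise ValueError("target_miss_probability must be strictly between 0 and 1")
    return math.ceil(math.log(target_miss_probability) / math.log(1.0 - withheld_fraction))


# Usage: if half the chunks must be withheld to hide a block,
# how likely is a single light node to be fooled?
for s in (5, 10, 20, 30):
    print(s, miss_probability(0.5, s))

print(samples_needed(0.5, 1e-9))
```

Under these assumptions the chance of a single node being fooled halves with every extra sample, so a few dozen tiny downloads already give very high confidence. That is the per-node picture only; the "more nodes sampling = more security" part comes from many nodes together covering enough chunks for the data to be reconstructed.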
What if a light node can do all the above?
I’ll refer to these nodes as full verifiers, but it’s much more important to point out that they are trust-minimized. They offer almost the same trustless properties as full nodes without having to download blocks. Magical stuff.
Given the reality that most people won’t run full nodes, trust-minimized light nodes are the practical vision for protecting all blockchain users from bad actors.
Shoutout to the teams pushing this vision forward and putting cryptographic armor in our pockets 🫡