Ethereum has the highest validator count of any blockchain.
(Note: the chart above is outdated. It is from last year and no longer maintained; it's here to show a comparison that always bugged me. The leaderboard would look similar today, but the actual validator count is given below.)
However, these metrics can be misleading; the highest validator count does not imply the most decentralized network. There are other factors to consider such as geographic distribution and client diversity. There’s an excellent post going into factors to consider when evaluating validator decentralization, here.
What I'm focusing on, and what I believe is the most important factor in measuring network decentralization, is the number of independent stakers (or solo stakers).
A solo staker is an independent node operator (NO) in full possession of the staking keys and responsible for the setup, maintenance and operation of a validator.
Measuring how many solo stakers there are is not straightforward. I have recently updated a repository of mine that attempts to identify addresses belonging to solo stakers. This article goes into more detail on the findings and uses data from around 17/09/2023.
In the end, Ethereum is differentiated from other chains by optimising for decentralization and pushing an ethos of being able to stake from home, with an impressive number of independent node operators that dwarfs that of other networks.
Ethereum's validator count is impressive at first glance, but this is a vanity metric. Multiple validators can be run by the same operator. What’s important to consider is both staking concentration and the number of independent stakers.
It would be bad if the number of independent stakers was high but a significant portion of stake was controlled by only a few. Likewise, a healthy distribution of stake would not mean much if it was controlled by only a handful of actors.
Pareto Distribution, often referred to as the 80/20 rule, is seen in various economic scenarios and implies that a small percentage of stakers might control a large portion of the total stake. This kind of concentration can be a concern for decentralization. Yet, Ethereum's strength lies in its permissionless nature. The ability for independent actors to join and participate in consensus without permission ensures a broader distribution of power, acting as a counterbalance to this concentration.
This represents how stake is allocated across different entities.
A ‘validator’ can be used as a measurement of the amount staked, as each validator must put up a 32 ETH bond. However, any node operator can run multiple validators, so it is less an indicator of the number of stakers on the network and more an indicator of how much is staked.
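As a rough illustration of why validator count measures stake rather than stakers, the sketch below converts a validator count into bonded ETH. It assumes exactly 32 ETH per validator and ignores rewards and penalties; `staked_eth` is a hypothetical helper, and the count used is the active validator figure quoted later in this article.

```python
ETH_PER_VALIDATOR = 32  # bond required to activate one validator


def staked_eth(validator_count: int) -> int:
    """Approximate ETH bonded by a given number of validators."""
    return validator_count * ETH_PER_VALIDATOR


# The result is the same whether these validators belong to one
# operator or to hundreds of thousands of operators.
print(staked_eth(805_541))  # 25777312
```

The number says how much is bonded, not how many people bonded it.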
LIDO currently controls around 32.3% of the network; another ~30% is held by the largest exchanges in the space: Coinbase, Kraken & Binance.
It's important to note that LIDO comprises 29 separate staking providers. However, this is still a permissioned set and thus, in its current form, presents a centralisation risk to Ethereum.
The ‘unknown’ portion is where independent stakers can be found, at around 28.3% of stake. Additionally, Rocket Pool manages 3.56% of stake. As it is the only decentralized staking pool on Ethereum today (with independent node operators), I count its operators as ‘solo or independent’ node operators.
At its core, Ethereum is a decentralised network of nodes that communicate via peer-to-peer networking, to keep the distributed ledger in sync and up to date, in addition to providing new nodes that need to download the full blockchain with information about past blocks. The more nodes the more decentralised and robust the network.
Every NO must run their own full node (execution + consensus clients), and multiple validators can be connected to one full node. More independent stakers means more nodes. We would expect larger entities to run many validators on one full node, spinning up more nodes to balance larger clusters of validators.
‘Rated.network’ estimates that only around 6.5% of the network are solo stakers. Rated approached the issue by looking at a wide array of metrics related to validator performance thus mainly focusing on validator addresses.
I have taken a different, more direct approach, looking purely at deposit addresses to identify Ethereum addresses owned by solo stakers.
Note the distinction between a validator address and a deposit address. While the validator address is specific to the validator key (each 32 ETH validator has its own unique address), the deposit address is a standard Ethereum address that can be used for multiple purposes, including depositing funds for staking.
I started by looking for unique depositors to the staking contract and at what type of address interacts with the deposit contract. I’ve divided addresses into three categories:
1. Ethereum Account - TX history: high confidence of solo staker
An Ethereum account with contract interactions, which means any token interaction (initiated by the user) or contracts in DeFi. This is indicative of an Ethereum user experienced in general on chain activity and not an address owned by an institution/exchange.
2. Ethereum Account - No TX history: potential solo staker (temporary deposit addresses)
Solo stakers may use a temporary deposit address to conceal activity from other wallets. This is better for privacy, however a solo staker cannot be distinguished from a centralised actor following the same pattern.
3. Contract account: not a solo staker
Contract addresses can be ignored completely, as they are used by entities, pools and institutions that manage large amounts of funds and leverage a smart contract account for additional security. It would be extremely unusual for a solo staker to make a deposit from a custom contract address.
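The three categories above can be sketched as a simple decision rule. This is a minimal illustration, not the code from the repository: the `DepositAddress` record, its field names and the category labels are all hypothetical, and it assumes the two flags have already been derived from on-chain data.

```python
from dataclasses import dataclass


@dataclass
class DepositAddress:
    address: str
    is_contract: bool                # True if the address is a contract account
    has_contract_interactions: bool  # user-initiated token or DeFi activity


def classify(addr: DepositAddress) -> str:
    """Map a deposit address to one of the three categories."""
    if addr.is_contract:
        return "not_solo_staker"              # category 3
    if addr.has_contract_interactions:
        return "high_confidence_solo_staker"  # category 1
    return "potential_solo_staker"            # category 2


print(classify(DepositAddress("0xabc", False, True)))
# high_confidence_solo_staker
```

The contract check comes first, so a contract account is excluded regardless of its other activity.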
I want to briefly touch on Distributed Validator Technology (DVT); though still early, it’s becoming increasingly relevant.
This technology enables stakers to join a shared validator (cluster), enabling many more independent node operators.
The goal is to find independent node operators who increase decentralization. Deposits from contract addresses have been excluded to filter out entities, which makes sense right now, but in future, with DVT, many more contract addresses will represent home stakers depositing to a shared validator.
Beacon Chain deposit contract:
I took a snapshot of all deposits from Ethereum accounts (excluding contracts) up to 17/09/23.
There were 379,285 deposit transactions from Ethereum accounts (2,157 failed transactions).
On the same date there were a total of 805,541 active validators, leaving about 424,029 deposits from contract addresses. (Since the time of writing, another 20k or so validators have been activated.)
Left: 47% of deposits to the beacon chain were from normal Ethereum accounts, the rest are from contract addresses.
Right: of all deposit transactions from normal accounts, 14% had a previous history of smart contract/token interactions, the rest being temporary or used only for the deposit.
Of the 379,285 valid deposits to the beacon chain, 118,312 unique accounts were responsible. It might seem, then, that there are 100k or so stakers running a few validators each (3.2 on average), but this is not the case:
Most of these accounts are controlled by single large stakers or centralised entities. Coinbase alone seems to be responsible for between 79% and 88.6% of these unique addresses, despite not controlling the largest stake (about 8.3% from the earlier chart).
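The gap between deposits and depositors can be sketched as follows. The sample deposit records are made up for illustration and `deposits_per_address` is a hypothetical helper, not code from the repository; the last lines reuse the article's own totals.

```python
from collections import Counter


def deposits_per_address(senders):
    """Count valid deposits made by each sending address."""
    return Counter(senders)


# Hypothetical sample: one sender address per valid deposit transaction.
sample = ["0xaaa", "0xbbb", "0xaaa", "0xccc", "0xaaa"]
counts = deposits_per_address(sample)
unique_accounts = len(counts)
print(unique_accounts, counts["0xaaa"])  # 3 3

# The same ratio using the totals from the snapshot.
article_average = 379_285 / 118_312
print(round(article_average, 1))  # 3.2
```

An average like this hides how addresses cluster under a single owner, which is why the entity filtering that follows matters.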
The unique accounts can be found here under ‘Beacon Chain Depositors’: all unique Ethereum addresses that made a valid deposit to the Beacon Chain.
These unique addresses include centralised entities that did not deposit via smart contract, so the next step was to identify those entities and remove them.
Hildoby has entity tags which can be found here:
Most of these entity addresses are from here. This was used in combination with my own list and some manually reviewed addresses from other sources (see repo). Entities that were easy to identify can be found here under ‘Entity List’.
As mentioned before, deposits originating from contract addresses include most centralised entities. LIDO, which controls around 32% of the stake today, deposits from a contract address:
Contracts can handle batch deposits, which is ideal for large deposits like so:
Of the identified entities, Coinbase has the largest footprint. This is due to the way they deposit: there is no batching at all, unlike most CEXs such as Kraken & Binance. Coinbase generates a new address for each deposit, but these can be identified by patterns, which I divided into two categories:
Coinbase A: Positive Coinbase Owned
These belong to Coinbase with near 100% certainty: the change is sent back to a ‘Coinbase miscellaneous’/change address. This is automated behaviour, with Coinbase collecting its loose change. 94,008 of the unique deposit addresses follow this pattern.
Coinbase B: Possibly Coinbase Owned
Coinbase being one of the largest sources of liquidity for Ethereum, stakers can use the exchange as a ‘mixing pool’ of sorts to obfuscate the source of funds, withdrawing to a fresh Ethereum account for the deposit to the Beacon Chain.
This set contains potential solo-stakers, other entities or large stakers using Coinbase to obfuscate themselves, or even Coinbase themselves. It’s impossible to distinguish which. It's entirely possible that these are all privacy conscious solo-stakers making one deposit each, but it's much more likely that they belong to the latter groups (entities, large stakers or Coinbase).
10,862 addresses with the following pattern:
It's clear that Coinbase is responsible for most of the unique deposit addresses and the majority of entities deposit via contract address.
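The Coinbase share of unique deposit addresses follows directly from the counts above. A short sketch, using only figures from this article (the constant and function names are my own):

```python
UNIQUE_DEPOSITORS = 118_312
COINBASE_A = 94_008  # change returned to a Coinbase address
COINBASE_B = 10_862  # funded from Coinbase, ownership unclear


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


lower = share(COINBASE_A, UNIQUE_DEPOSITORS)
upper = share(COINBASE_A + COINBASE_B, UNIQUE_DEPOSITORS)
print(f"{lower:.1f}% to {upper:.1f}%")  # 79.5% to 88.6%
```

The lower figure counts only Coinbase A; the upper figure assumes every Coinbase B address is also Coinbase-owned, which lines up with the 79% to 88.6% range quoted earlier.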
After filtering out deposits from contracts and the entities identified in the ‘entity list’, what's left is the pool of potential solo-stakers.
In Rocket Pool, a Node Operator (NO) runs the same software base and carries out the same tasks and responsibilities as a solo-staker.
Rocket Pool is a staking provider but, unlike LIDO, joining as a node operator is open to anyone. It has thus enabled many more independent stakers and given rise to the most decentralised staking derivative, rETH.
I won’t expand on the technical differences between these staking providers here (to learn about it see here), but I will expand on its inclusion as it accounts for a significant portion of the independent stakers that I include in this category though not technically a ‘solo-staker’.
To the beacon chain a Rocket Pool validator is no different from any normal validator, but the deposit is: the remaining ETH is matched to the node operator (from the pool funded by rETH minters) via smart contract.
This means the smart contract is what makes the deposit, hence it's filtered out by my methodology and requires a different approach. For Rocket Pool I looked into the Rocket Pool contracts, using the Rocketscan API.
There are 3,220 NO accounts but 3,134 withdrawal addresses. Some NO accounts share a withdrawal address, which suggests a small number of operators run multiple NO accounts.
NOs generally create one ‘node account’ to manage staking, keys and Rocket Pool Token (RPL) functions for all minipools. It's possible to use this account for other activities (some do), but it's ill-advised and can increase security risks. Withdrawal addresses give a better indication of how many individual actors there are.
3,134 unique withdrawal addresses belonging to Rocket Pool NOs.
At the Merge in 2022 this was 1,461, so the number of independent stakers grew 114.5% in one year, which is impressive during a bear market.
This is in part due to the Atlas upgrade and lower ETH bonded (LEB) minipools, which let a node operator bond only 8 ETH, with the remaining 24 ETH coming from rETH minters. You can see an increase in node operators after the upgrade.
174 addresses are also Beacon chain depositors, so about 5.5% of identifiable Rocket Pool NOs are also running solo validators.
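Finding operators who appear in both groups is a set intersection. The sketch below uses made-up addresses and hypothetical helper names; it lower-cases addresses first, on the assumption that the two sources may differ in checksum casing. The final lines reuse the article's counts.

```python
def normalise(addresses):
    """Lower-case addresses so checksum casing does not block a match."""
    return {a.lower() for a in addresses}


def overlap(rocket_pool_addresses, beacon_depositors):
    """Addresses present in both the Rocket Pool and Beacon Chain sets."""
    return normalise(rocket_pool_addresses) & normalise(beacon_depositors)


# Hypothetical addresses, for illustration only.
rocket_pool = ["0xAAA1", "0xbbb2", "0xccc3"]
beacon = ["0xaaa1", "0xddd4"]
both = overlap(rocket_pool, beacon)
print(both)  # {'0xaaa1'}

# Share of Rocket Pool addresses that also deposited directly.
overlap_pct = 100 * 174 / 3_134
print(round(overlap_pct, 2))  # 5.55
```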
This seems to suggest that Rocket Pool opened up an opportunity for holders of smaller amounts of ETH, its core target, although a few solo stakers are also attracted by the slightly boosted yield (but must be comfortable with RPL risk).
There are 28,888 Minipools
A minipool is a validator where the 32 ETH is split between the Rocket Pool NO's own stake and the ETH from rETH minters. To the beacon chain it's no different than a 32 ETH validator.
At the Merge this was 6,932, a 316% increase over 1 year. Again, you can see the noticeable increase after Atlas.
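Both growth figures come from the same percentage-change formula. A quick check using the counts quoted in this section (`pct_increase` is a hypothetical helper name):

```python
def pct_increase(before: int, after: int) -> float:
    """Percentage growth from before to after."""
    return 100 * (after - before) / before


# Withdrawal addresses: 1,461 at the Merge, 3,134 at the snapshot.
print(round(pct_increase(1_461, 3_134), 1))   # 114.5

# Minipools: 6,932 at the Merge, 28,888 at the snapshot.
print(round(pct_increase(6_932, 28_888), 1))  # 316.7
```

The minipool figure comes out at 316.7% to one decimal place, which the text rounds down to 316%.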
The 28,888 minipools (validators) account for 3.56% of total ETH staked. It wouldn’t be great if these were all run by a few node operators, so staking distribution is also important.
It looks like a third of NOs are running 1 minipool, which might suggest that Rocket Pool has opened up ETH staking to a large number of holders with less than 32 ETH, but most run more than 1 minipool.
So, Rocket Pool Node Operators are individuals running validator software, while minipools represent the actual validators. This system amplifies decentralization for Ethereum: while operating in a pooled framework, these NOs maintain the spirit of solo staking.
My goal is to identify independent stakers on the Ethereum network, hence Rocket Pool node operators are recognized as a significant group, and the terms "Solo staker" and "independent node operator" are used interchangeably.
With unique addresses and identified entities filtered out, what should remain are solo stakers. It's not so simple, though: the challenge lies in distinguishing between individual stakers and entities that might control multiple addresses. To narrow this down I categorized solo stakers into two lists:
Solo-Staker-A: this list contains 17,065 Ethereum addresses that have made valid deposits to the Beacon Chain. I removed addresses associated with known, easily identifiable entities. However, it's worth noting that smaller pools or institutions that aren't easily identified might still be present in this list. Some stakers use single-use deposit addresses for privacy, so multiple addresses might belong to a single entity.
Solo-Staker-B: this list is more specific. It contains addresses that have not only made valid Beacon Chain deposits but have also interacted with smart contracts. I identified 9,420 addresses with a history of contract interactions or token transfers. These interactions suggest that the addresses belong to Ethereum network users and not institutions or exchanges.
NOTE: both included the Rocket Pool withdrawal addresses.
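The two lists amount to a pair of set operations. The sketch below is a simplified illustration with made-up addresses, not the repository's code: `build_solo_staker_lists` and its parameters are hypothetical, and adding the Rocket Pool withdrawal addresses is left out.

```python
def build_solo_staker_lists(depositors, contracts, entities, with_history):
    """Return (list_a, list_b) as sets of addresses.

    list_a: depositors minus contract accounts and identified entities.
    list_b: members of list_a with contract or token interaction history.
    """
    list_a = set(depositors) - set(contracts) - set(entities)
    list_b = list_a & set(with_history)
    return list_a, list_b


# Hypothetical addresses, for illustration only.
depositors = {"0xa1", "0xb2", "0xc3", "0xd4", "0xe5"}
contracts = {"0xb2"}
entities = {"0xc3"}
with_history = {"0xa1", "0xc3"}

list_a, list_b = build_solo_staker_lists(
    depositors, contracts, entities, with_history
)
print(sorted(list_a))  # ['0xa1', '0xd4', '0xe5']
print(sorted(list_b))  # ['0xa1']
```

Because the second list is built from the first, it is always a subset of it.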
Amount staked per solo node operator, excluding Rocket Pool NOs. For a clearer representation of stake distribution, I focused on the Solo-Staker-B list only. These addresses exhibit no behavior suggesting the use of temporary addresses to obfuscate.
Solo-Staker-A list contains potential solo stakers, but its inclusion of multiple temporary deposit addresses, possibly owned by a single entity, can muddy the distribution picture.
Including Rocket Pool node operators, I get about 4.39% of staked ETH under the control of independent stakers at minimum. I believe the true number is a little higher, as I am only counting high-confidence solo stakers, and the upper bound is a lot higher.
Although Ethereum is a public blockchain, it doesn't hold user-identifying information, so it's difficult to determine the true number of solo stakers.
Using my research into identifying unique addresses from ‘solo validator addresses’ at various milestones in Beacon chain history, I believe the real number of solo-stakers falls somewhere between A and B below:
Set B contains addresses that deposited (in many cases more than once) and have a rich transaction history indicative of individual Ethereum users. This set is the lower bound, or floor, and is likely closer to the true number.
Set A is the upper bound: identified ‘Entities’ are removed and all potential individual actors are left. I would argue that this is a cap on the total number of solo-stakers, as there is no way to determine how many of these are individual actors versus single actors using multiple deposit addresses.
NOTE: the 10k addresses in Coinbase B would also contain potential solo-stakers, making the upper bound even higher.