Ethereum peer-to-peer network Eclipse attack

Zer0Luck

January 25th, 2023

Goal

Understand the structure of the Ethereum P2P network and explain possible vulnerabilities in its network layer.

This post covers node operation and the detailed components of the network based on geth v1.8.0, and uses them to study the Eclipse attack.

It is a review of the paper Low-Resource Eclipse Attacks on Ethereum’s Peer-to-Peer Network.

Overview

Ethereum nodes can be eclipsed by abusing the P2P network used for neighbor discovery.
The attack can be launched using only two hosts, each with a single IP address.
The Eclipse attacker monopolizes all of the victim's incoming and outgoing connections, isolating the victim from the rest of the peers on the network.
The attacker can then filter the victim's view of the blockchain or co-opt the victim's computing power as part of a more sophisticated attack.
Because Ethereum adopts the Kademlia P2P protocol, it is likely to be vulnerable to Eclipse attacks.

Bitcoin and Ethereum node

In the period this post covers (geth v1.8.0), both Bitcoin and Ethereum use a proof-of-work algorithm to reach consensus, but Bitcoin's consensus algorithm uses the simple longest-chain rule.
Ethereum's consensus algorithm is based on the more sophisticated GHOST protocol.
Both networks support smart contracts, but Bitcoin smart contracts must be written in a very restrictive assembly-like language.
Ethereum's smart contract language is Turing-complete.
Both Bitcoin and Ethereum use peer-to-peer networks to communicate the state of the blockchain, but Bitcoin's network emulates an unstructured random graph.
Ethereum's network is based on the Kademlia DHT.

P2P Network Security Issues

Despite growing research into the security of Ethereum's consensus algorithm and scripting language, most properties of the Ethereum peer-to-peer network remain understudied.
Prior work has made it clear that the security properties of a proof-of-work blockchain depend on the security of the underlying peer-to-peer network.
If the peer-to-peer network is partitioned and different nodes see different views of the blockchain, how can these nodes reach consensus on what the blockchain actually is?

Eclipse Attack of P2P Network (easy)

In an Eclipse attack, the attacker fully controls the victim's access to information, which lets the attacker filter the victim's view of the blockchain or co-opt the victim's computing power as part of a more sophisticated attack.
The Eclipse attack studied here is one where the attacker monopolizes all of the victim's incoming and outgoing connections, isolating the victim from the rest of its peers on the network.

Why is Ethereum Vulnerable to Eclipse Attacks?

At first glance, Ethereum's peer-to-peer network seems more resilient to Eclipse attacks than Bitcoin's.
Bitcoin nodes only make 8 outgoing connections by default, whereas Ethereum nodes make 13 outgoing connections by default.
Ethereum's P2P network has cryptographically authenticated messages, whereas Bitcoin nodes do not authenticate P2P network messages.
This means that the Bitcoin peer-to-peer network is vulnerable to integrity attacks by man-in-the-middle attackers and by attackers who manipulate BGP, the Internet's routing protocol.
Nevertheless, an Ethereum Eclipse attacker only needs to control two machines, each with a single IP address. The attack is off-path:
the attacker only controls end hosts and does not occupy a privileged position between the victim and the rest of the Ethereum network.
In contrast, the best-known off-path Eclipse attack against Bitcoin requires the attacker to control hundreds of host machines, each with a distinct IP address.
For most Internet users, obtaining hundreds (or thousands) of IP addresses is far from trivial. That is why the Bitcoin Eclipse attacker envisioned in prior work is a full-fledged botnet or an Internet service provider, while the BGP-hijacking Bitcoin Eclipse attacker requires access to a core Internet router that speaks BGP.

Major issues (root cause)

Nodes on the Ethereum network are identified by cryptographic ECDSA public keys.
In geth versions prior to v1.8, an unlimited number of Ethereum nodes with different ECDSA public keys can be run on the same single system with the same single IP address.
Since generating a new ECDSA public key is simple (one only needs to run the ECDSA key generation algorithm), an attacker can generate thousands of Ethereum node IDs in seconds without consuming significant computing resources.
An attacker creates a set of Ethereum node IDs and then launches the Eclipse attack from two host systems (each with a single IP address).
Ethereum nodes form connections to peers in a biased way that attackers can easily predict (i.e., some node IDs are more likely to become peers than others).
Therefore, the attacker chooses their node IDs carefully so that the victim is more likely to connect to the attacker's node IDs than to legitimate node IDs.
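To make the "carefully chosen node IDs" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: random 64-byte strings stand in for freshly generated ECDSA public keys, `hashlib.sha3_256` stands in for the hash that geth applies to a NodeID, and the helper names (`table_key`, `log_distance`, `craft_node_ids`) are hypothetical and not taken from geth.

```python
import hashlib
import os


def table_key(node_id: bytes) -> bytes:
    # Assumption: SHA3-256 from hashlib stands in for the hash geth applies to a NodeID.
    return hashlib.sha3_256(node_id).digest()


def log_distance(a: bytes, b: bytes) -> int:
    """Bit length of a XOR b (0 means the two values are identical)."""
    return (int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).bit_length()


def craft_node_ids(victim_id: bytes, bucket: int, count: int) -> list:
    """Keep sampling fake IDs until `count` of them fall at log distance `bucket`."""
    victim_key = table_key(victim_id)
    crafted = []
    while len(crafted) < count:
        candidate = os.urandom(64)  # placeholder for a new ECDSA public key
        if log_distance(victim_key, table_key(candidate)) == bucket:
            crafted.append(candidate)
    return crafted


victim = os.urandom(64)
ids = craft_node_ids(victim, bucket=254, count=16)
print(len(ids))  # 16
```

Because generating candidates costs almost nothing, rejection sampling like this is enough to aim node IDs at a chosen part of the victim's table.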

Ethereum P2P network protocol and its relationship to the Kademlia DHT

The Ethereum P2P network protocol is based on the Kademlia DHT.
However, the design goals of the two are fundamentally different. Kademlia provides an efficient means of storing and retrieving content in a decentralized peer-to-peer network.
In Kademlia, each item of content is stored on a small subset of peers in the network.
Kademlia allows each content item to be retrieved by querying a number of nodes that is only logarithmic in the size of the network.
In contrast, in the Ethereum protocol there is only one item of content that every node seeks to discover: the Ethereum blockchain.
The entire Ethereum blockchain is stored on each Ethereum node.
Content lookup is therefore not required on Ethereum's P2P network.
The Kademlia-derived protocol is used only to discover new peers.
This is the sense in which Ethereum's design is inherited from Kademlia.

Two Off-Path Attack

Eclipse Attack, Time Manipulation Attack

Eclipse by connection monopolization

Connection monopolization takes advantage of the fact that the network connections to an Ethereum client can all be incoming, i.e. initiated by other nodes. Thus, the attacker waits for the victim to reboot (or intentionally forces a reboot by sending a packet-of-death to the Ethereum client or host OS), then immediately initiates incoming connections to the victim from each attacker node.
If all connection slots are occupied by the attacker, the victim is eclipsed.

Eclipse by owning the table

A connection monopolization attack can be trivially eliminated by forcing Ethereum clients to make both incoming (initiated by other nodes) and outgoing (client-initiated) connections.
The paper shows that even if this countermeasure is adopted, Ethereum is still vulnerable to low-resource Eclipse attacks. To do this, it presents an Eclipse attacker who repeatedly pings the victim using a carefully crafted set of node identifiers.
When the victim restarts, it forms all 13 of its outgoing connections to the attacker with high probability.
To complete the Eclipse attack, the attacker monopolizes the remaining connection slots
with unsolicited incoming connections from its attacker node identifiers.

Kademlia-style network formation guarantees a low maximum hop distance between any pair of peers in the network. However, Ethereum rarely needs a node a to “find” a specific other node b, so this structure is unnecessary. The only time this happens is when a node is statically configured by the user and the node's IP address is missing (only its node ID is known). There are also rumors of “sharding,” where each Ethereum node would only need to store a small portion of the blockchain. (At the P2P level, the broadcast model is not scalable because every node has to download and rebroadcast O(n) data, i.e. every transaction sent.) On the other hand, our decentralization criterion assumes that

Attack by manipulating time

A node can also be eclipsed if its local clock is more than 20 seconds ahead of the other nodes on the Ethereum network.
Such an attack can be carried out, for example, by manipulating the Network Time Protocol (NTP) used by the host running the Ethereum node.
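The effect of a skewed clock can be sketched in a few lines of Python. This is a simplified model, not geth's code: it assumes each message carries the sender's timestamp and that the receiver drops anything more than 20 seconds older than its own clock. The function name `accepts` and the sample values are made up for illustration.

```python
EXPIRATION_WINDOW = 20  # seconds; the staleness limit described in the text


def accepts(message_timestamp: float, receiver_clock: float) -> bool:
    """Drop a message whose timestamp is more than 20 s behind the receiver's clock."""
    return receiver_clock - message_timestamp <= EXPIRATION_WINDOW


true_time = 1_000_000.0
honest_msg = true_time          # stamped by a node with a correct clock
attacker_msg = true_time + 25   # attacker stamps messages to match the skewed victim

victim_clock = true_time + 25   # victim's clock pushed 25 s ahead, e.g. via NTP

print(accepts(honest_msg, victim_clock))    # False
print(accepts(attacker_msg, victim_clock))  # True
```

Under this model the victim silently discards every honest discovery message as stale, while messages from an attacker who accounts for the skew are still accepted.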

Two Off-Path Attack Mitigation

As a countermeasure against these attacks, the most important recommendation is that Ethereum stop using the ECDSA public key as the sole node identifier.
Instead, a combination of IP address and public key should be used.
In addition, Ethereum should be hardened through design decisions that differ from those traditionally part of the Kademlia protocol.
geth v1.8, released on 2018.02.14, adopted several of these measures.

Effect of Eclipse Attack

Attacks on consensus

The Eclipse attack can be used to attack the blockchain's consensus algorithm by co-opting the victim's mining power.
Eclipse attacks are part of optimal adversarial strategies for double spending and selfish mining.

Attacks on blockchain layer-two protocols

In blockchain layer-two protocols (Bitcoin's Lightning Network, Ethereum's Raiden Network, ...), a pair of users posts a blockchain transaction to establish a payment channel between them.
The users can then pay each other using transactions that are not published on the blockchain.
These off-chain payments are fast because they are not subject to the blockchain's performance bottlenecks.
Finally, the users close the payment channel by posting a blockchain transaction that reflects the new coin balance between them.
Importantly, the security of these protocols requires that no off-chain payments are sent after the payment channel is closed. An Eclipse attacker can therefore trick the victim into thinking that the payment channel is still open,
even though the non-eclipsed part of the network sees that the payment channel is closed.
For example, if the victim is a merchant who releases an item in exchange for an off-chain payment, the attacker can obtain the item without paying.

Attacks on Smart Contract

Ethereum's smart contracts have several unique properties.
For example, smart contracts can contain variables with persistent state that can be changed by transactions posted to the Ethereum blockchain.
Ethereum smart contracts can be attacked if users see inconsistent views of the blockchain.

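Let me give you a simple example.

Contract: Digital Cat Auction

Logic:
	1. The contract keeps a state variable x that counts the bids on the Digital Cat.
	2. Alice is willing to bid (send Ether) on the Digital Cat only if x < 5; her bid is a transaction (TX) that gets included in a block.

Attacker:
	1. Shows the eclipsed Alice a view of the blockchain where x < 5, so Alice sends the attacker a signed bid transaction T.
	2. Forwards T to the non-eclipsed part of the network.
	3. There x > 5, yet T is still verified and accepted.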
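The same scenario can be written as a toy Python model. It does not touch a real contract; `ChainView`, `alice_signs_bid` and the bid counts are hypothetical names and values chosen only to mirror the steps above.

```python
from dataclasses import dataclass

MAX_BIDS_ALICE_ACCEPTS = 5  # Alice only bids while x < 5


@dataclass
class ChainView:
    bid_count: int  # the contract's state variable x as seen in this view


def alice_signs_bid(view: ChainView) -> bool:
    """Alice signs a bid transaction T only if her view shows x < 5."""
    return view.bid_count < MAX_BIDS_ALICE_ACCEPTS


real_chain = ChainView(bid_count=7)     # non-eclipsed network: x > 5
eclipsed_view = ChainView(bid_count=3)  # stale view the attacker feeds to Alice

signed = alice_signs_bid(eclipsed_view)            # Alice is fooled into signing T
would_sign_honestly = alice_signs_bid(real_chain)  # with the real view she would refuse
print(signed, would_sign_honestly)  # True False
```

The signature on T is valid regardless of which view Alice saw, which is why the attacker can replay it on the real chain.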

Ethereum P2P Network

This section details Ethereum's peer-to-peer network as implemented in the geth client,
focusing on geth's main neighbor discovery protocol, known as RLPx Node Discovery Protocol v4.

Kademlia Similarities and Differences

The Ethereum peer-to-peer network is based on the Kademlia DHT, but with a significantly different purpose.
Kademlia is designed as an efficient means of storing and retrieving content in a decentralized peer-to-peer network.
Ethereum's peer-to-peer network is only used to discover new peers.
In the Kademlia network, each item of content is associated with a key (a b-bit value) and is stored only by peers whose NodeID (also a b-bit value) is “close” to the associated key.
Kademlia's notion of “close” is given by the XOR metric: the distance between two b-bit strings t and t’ is their bitwise exclusive or (XOR) interpreted as an integer, $d(t, t') = t \oplus t'$.

Each Kademlia node has a data structure consisting of b distinct buckets, where bucket i stores network information about up to k peers at distance i (measured per the XOR metric from the node's own NodeID).
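As a small worked example of the XOR metric, the Python sketch below computes the distance between byte strings and derives a bucket index from it. Using the bit length of the XOR as the bucket index is a common Kademlia convention and is assumed here; the 1-byte IDs are toy values.

```python
def xor_distance(a: bytes, b: bytes) -> int:
    """XOR of the two strings, interpreted as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def bucket_index(own: bytes, other: bytes) -> int:
    """Logarithmic distance: 1-based position of the highest differing bit, 0 if equal."""
    return xor_distance(own, other).bit_length()


own = bytes([0b1010_0000])
near = bytes([0b1010_0001])  # differs only in the lowest bit
far = bytes([0b0010_0000])   # differs in the highest bit

print(xor_distance(own, near), bucket_index(own, near))  # 1 1
print(xor_distance(own, far), bucket_index(own, far))    # 128 8
```

IDs that share a long prefix with the node land in low buckets, and half of all possible IDs land in the highest bucket.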
According to Ethernodes.org, as of 2017.01.04, 70% of Ethereum nodes run geth.
To look up a key t, a Kademlia node finds the NodeIDs in its buckets that are “closest” to t and asks each of them to either
- return the content associated with t, or
- return some NodeIDs that are “closer” to t.
This lookup process is repeated until the key is found, and the number of nodes queried in a lookup is logarithmic in the number of nodes in the network. (This is the core property of the Kademlia protocol design.)
Ethereum uses the same XOR metric and the same bucket data structure.
However, Ethereum nodes do not need to identify peers that store a target content item, because there is only one content item (the Ethereum blockchain) and all peers store it.
As such, lookups are primarily used to discover new peers. (There is a small and rare exception when resolving a NodeID to an IP address.)
To do this, the Ethereum node selects a random target t and finds the k = 16 NodeIDs in its buckets that are closest to t.
It asks each of these nodes to return the k NodeIDs from its own buckets that are “closer” to target t, yielding at most $k \times k$ newly discovered NodeIDs. Of these, the k nodes closest to target t are then asked to return k nodes that are even closer to t.
This process repeats until no new nodes are found. In other words, Ethereum lookups are mostly a nifty way to fill buckets with randomly selected NodeIDs.

Ethereum P2P Network Components

NodeID

Ethereum network peers are identified by NodeID.

NodeID b = 512bit (64 byte) ECDSA Public-key

Multiple Ethereum nodes, each with a different NodeID, can run on a single system with a single IP address.
Generating an ECDSA key is easy. All you have to do is run the ECDSA key generation algorithm.
There is no mechanism to ensure that a unique NodeID corresponds to a unique network address.
As a result, an unlimited number of nodes can run on the same computer using the same IP address. (This is the main attack vector.)

Network Connection

UDP connections are used only to exchange information about the P2P network.
There is no limit on the total number of UDP connections, other than that at most 16 can be in progress concurrently.
A ping message requests a pong message as a response. This pair of messages is used to determine whether a neighboring node is responsive.
The findnode message requests a neighbor message containing a list of 16 nodes seen by the responding node. (A node only responds to a findnode request if the querying node already exists in its own db.)
All UDP messages are timestamped and cryptographically authenticated under the sender's ECDSA key (i.e. the sender's NodeID).
To limit replay attacks, the client drops UDP messages whose timestamps are more than 20 seconds older than the client's local time.
To prevent pong messages from being sent from spoofed IP addresses, each pong also includes the hash of the ping it is responding to.
TCP Connection
- All blockchain information is exchanged over encrypted and authenticated TCP connections.
- The total number of TCP connections at any given time is at most maxpeers, which is set to 25 by default.
- To eclipse a node, an attacker must continuously occupy all maxpeers of the target's TCP connections.
- A TCP connection is outgoing when it is initiated by the client (i.e., the client sends the TCP SYN packet).
- Otherwise, it is incoming.
- A client can initiate outgoing TCP connections with other nodes up to $[1/2(1 + maxpeers)]$ (default: 13).
- In contrast, prior to geth v1.8.0 there was no limit on the number of unsolicited incoming TCP connections other than maxpeers.
- This means that all maxpeers of a client's TCP connections can be unsolicited incoming connections, which is exploited in brute-force Eclipse attacks.
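The slot arithmetic above can be checked with a short Python sketch. It is a simplified model rather than geth's connection logic: `flood_incoming` and the `reserve_outgoing` flag are hypothetical, with the flag modelling a countermeasure that keeps the outgoing share of the slots free from incoming connections.

```python
MAXPEERS = 25  # default stated in the text


def max_outgoing(maxpeers: int) -> int:
    """Outgoing cap from the text: half of (1 + maxpeers), i.e. 13 for 25."""
    return (1 + maxpeers) // 2


def flood_incoming(maxpeers: int, attempts: int, reserve_outgoing: bool) -> int:
    """Count the unsolicited incoming connections a freshly rebooted node accepts."""
    if reserve_outgoing:
        incoming_cap = maxpeers - max_outgoing(maxpeers)
    else:
        incoming_cap = maxpeers  # pre-v1.8.0 behaviour: only maxpeers limits incoming
    accepted = 0
    for _ in range(attempts):
        if accepted < incoming_cap:
            accepted += 1
    return accepted


print(max_outgoing(MAXPEERS))                                 # 13
print(flood_incoming(MAXPEERS, 100, reserve_outgoing=False))  # 25
print(flood_incoming(MAXPEERS, 100, reserve_outgoing=True))   # 12
```

Without a reservation, an attacker who connects first after a reboot can take all 25 slots; with it, the attacker still has to win the 13 outgoing slots some other way.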

Storing network information

The client stores information about other nodes in two data structures.
The first is a long-term database called db, which is stored on disk and persists across client reboots.
The second is a short-lived structure called table, which contains Kademlia-like buckets and is always empty when the client reboots.

The db is used for long-term storage of network information, whereas the table is used to select peers (outgoing TCP connections).
The database is stored on disk and contains information about each node seen by the client. (A node counts as seen if it responds with a valid pong to a ping sent by the client.) There is no limit on the size of the database.
Each database entry records:
- NodeID
- IP address
- TCP port, UDP port
- time of the last ping sent to the node and of the last pong received from it
- number of times the node failed to respond to a findnode message
The age of a node is the time elapsed since the last pong received from the node.
Every hour (starting from the first successful bonding),
the client runs a purge process that removes nodes older than 1 day from the db.

table

The table is always empty when the client reboots.
The table consists of 256 buckets, each of which can hold up to k = 16 entries.
Each entry records information about another Ethereum node, namely its NodeID, IP address, TCP port, and UDP port.
Entries in each bucket are sorted in the order in which they were added to the bucket.
When the client discovers a new node that maps to an already full bucket, the client pings the last node in the bucket (i.e. the oldest node).
Prior to geth v1.8.0, if this old node fails to respond with a pong, the new node is added to the bucket and the old node is evicted. Otherwise, the new node is not added to the bucket.
Because ping messages are sent over UDP, clients can still send them even if all available TCP connections are in use.
If a node fails to respond to findnode requests more than 4 times, it is also removed from the table.
All of the above is similar to how buckets are maintained in the Kademlia protocol.
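The bucket rule described above (ping the oldest entry, evict it only if it stays silent) might look roughly like this in Python. This is a sketch of the pre-v1.8.0 behaviour as described in this post, not geth's implementation; `add_to_bucket` and the `responds_to_ping` callback are hypothetical.

```python
from collections import deque
from typing import Callable

BUCKET_SIZE = 16  # k in the text


def add_to_bucket(bucket: deque, new_node: str,
                  responds_to_ping: Callable[[str], bool]) -> bool:
    """Try to add new_node; return True if it ends up in the bucket."""
    if new_node in bucket:
        return True
    if len(bucket) < BUCKET_SIZE:
        bucket.appendleft(new_node)  # newest entries sit at the front
        return True
    oldest = bucket[-1]              # last entry is the oldest one
    if responds_to_ping(oldest):
        return False                 # oldest is alive, so the new node is dropped
    bucket.pop()                     # oldest is unresponsive: evict it
    bucket.appendleft(new_node)
    return True


bucket = deque(f"n{i}" for i in range(BUCKET_SIZE))
print(add_to_bucket(bucket, "new", lambda node: True))   # False
print(add_to_bucket(bucket, "new", lambda node: False))  # True
```

The rule favours long-lived entries, so once attacker IDs occupy a bucket and keep answering pings, honest newcomers cannot displace them.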

Fill the data structure

How are the database and the table populated?

bootstrap node

When the client first boots, it has an empty database and only knows about 6 hardcoded bootstrap nodes.

Bonding

The bonding process is used to populate both the db and the table. When a client bonds with a node, it first checks whether:
- the node exists in its database,
- the database records 0 failed responses to findnode requests from the node, and
- the database records that the node responded with a pong within the last 24 hours.
If all three hold, the client immediately tries to add the node to the table (the node is actually added only if there is space). Otherwise, the client pings the node.
If the node responds with a pong, bonding is successful.
If bonding succeeds, the client adds or updates the node's entry in its database and tries to add the node to the table.
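The decision of whether bonding needs a fresh ping can be sketched as a single predicate. This is an illustrative reading of the three checks in this section; `DbEntry` and `needs_ping` are hypothetical names, and timestamps are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional

ONE_DAY = 24 * 60 * 60  # seconds


@dataclass
class DbEntry:
    last_pong: float     # time of the last pong received from the node
    findnode_fails: int  # failed responses to findnode requests


def needs_ping(entry: Optional[DbEntry], now: float) -> bool:
    """True if the client must ping the node; False if it can go straight to the table."""
    if entry is None:                      # node is not in the db
        return True
    if entry.findnode_fails != 0:          # db records failed findnode responses
        return True
    return now - entry.last_pong > ONE_DAY # last pong is older than 24 hours


now = 1_000_000.0
print(needs_ping(None, now))                                   # True
print(needs_ping(DbEntry(now - 3600, findnode_fails=0), now))  # False
print(needs_ping(DbEntry(now - 3600, findnode_fails=1), now))  # True
```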

Unsolicited pings

When the client receives an unsolicited ping from another node, it responds with a pong and then bonds with that node.

Lookup

The client searches for nodes using the iterative lookup(t) method (similar to Kademlia's lookup method).
The lookup(t) method relies on the following notion of “closeness” to a target t, where t is a 256-bit string.
Let a and b be two NodeIDs, and define

$$d_A = \mathrm{SHA3}(a) \oplus t, \qquad d_B = \mathrm{SHA3}(b) \oplus t$$

Here, $\oplus$ is bitwise XOR. If $d_A < d_B$, then a is “closer” to t.
Otherwise b is “closer” to t (the Kademlia XOR metric). The lookup function uses this notion of closeness to search for nodes as follows:
- First, the client selects the 16 nodes closest to t in its table.
- The client queries each of these 16 nodes using the findnode message.
- The findnode message contains the target t.
- Upon receiving the findnode message from the client, the other node identifies the 16 nodes “closest” to t in its own table data structure and returns them to the client in a neighbor message.
- The client thus learns about up to 16 new nodes from each of the 16 nodes queried.
The client now has information about up to 16 × 16 = 256 new nodes and bonds with each of these new nodes.
Finally, the client identifies the 16 nodes closest to the target t from the set of 16 nodes queried with the findnode message and the set of new nodes that the client has successfully bonded with (up to 256).
It then repeats the process with these 16 closest nodes: it queries each node not queried in a previous iteration with a findnode message containing the target t, learns about up to 16 new nodes from each, and then identifies the next 16
nodes closest to t in the new set of 16 × 16 + 16 nodes. This continues iteratively until the set of 16 closest nodes is stable (it no longer changes between iterations).
The 16 closest nodes are added to the FIFO queue in the lookup_buffer data structure.
If a node fails to respond to the client's findnode requests 5 times in a row (as recorded in the db), lookup removes that node from the client's table.
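The iteration can be sketched with an in-memory "network" in Python. This is a simplified model under several assumptions: `hashlib.sha3_256` stands in for the SHA3 applied to NodeIDs, bonding and timeouts are skipped, and `network` is a hypothetical dictionary mapping each NodeID to the set of NodeIDs that node knows about.

```python
import hashlib

K = 16  # nodes queried and returned per round, as in the text


def key(node_id: bytes) -> int:
    # Assumption: hashlib's SHA3-256 stands in for the SHA3 applied to NodeIDs.
    return int.from_bytes(hashlib.sha3_256(node_id).digest(), "big")


def closest(nodes, t: int, k: int = K) -> list:
    """The k nodes whose hashed ID has the smallest XOR distance to t."""
    return sorted(nodes, key=lambda n: key(n) ^ t)[:k]


def lookup(t: int, table: set, network: dict) -> list:
    seen = set(table)   # every node learned so far
    queried = set()     # nodes already sent a findnode
    result = closest(seen, t)
    while True:
        pending = [n for n in result if n not in queried]
        if not pending:  # the closest set is stable
            return result
        for n in pending:
            queried.add(n)
            # findnode(t): the remote node returns its K closest entries to t
            seen.update(closest(network.get(n, set()), t))
        result = closest(seen, t)


nodes = [bytes([i]) for i in range(40)]
network = {n: set(nodes) for n in nodes}  # toy network: everyone knows everyone
found = lookup(12345, {nodes[0]}, network)
print(len(found))  # 16
```

Each round queries at least one node that was never queried before, so the loop always terminates on a finite network.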

seeding

The Ethereum client has a seeding process, which (in geth v1.8.0 and earlier) is triggered:
- when a node reboots,
- every hour, and
- when lookup() is called on an empty table.
The seeding process first checks whether the table is non-empty. If it is non-empty, nothing happens. If it is empty, the client bonds with each of the 6 bootstrap nodes and with randomly selected
seed nodes from the database that are less than 5 days old. Finally, when this bonding is complete, the client executes lookup(self).
Here self is the SHA3 hash of the client's own NodeID. (The lookup(self) step in seeding is inherited from Kademlia,
where it populates other nodes' buckets with the client's newly online NodeID.) The Eclipse attack takes advantage of the fact that the seeding process does nothing when the table is not empty.
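The key property the attack relies on, that seeding is skipped whenever the table is non-empty, can be shown in a few lines of Python. This is a sketch, not geth's code: `seed`, `bond` and `db_last_pong` are hypothetical, the sketch bonds with every fresh db node rather than a bounded random sample, and the final lookup(self) step is left out.

```python
import random

FIVE_DAYS = 5 * 24 * 60 * 60  # seconds


def seed(table: set, bootstrap: list, db_last_pong: dict, now: float, bond) -> bool:
    """Run seeding; return False if it was skipped because the table is non-empty."""
    if table:
        return False  # non-empty table: seeding does nothing
    for node in bootstrap:
        bond(node)
    fresh = [n for n, last_pong in db_last_pong.items() if now - last_pong < FIVE_DAYS]
    random.shuffle(fresh)  # seed nodes are picked in random order
    for node in fresh:
        bond(node)
    return True


now = 1_000_000.0
db = {"fresh-node": now - 100, "stale-node": now - 6 * 24 * 60 * 60}
bonded = []
print(seed(set(), ["boot1", "boot2"], db, now, bonded.append))           # True
print(seed({"attacker-node"}, ["boot1", "boot2"], db, now, lambda n: 0))  # False
```

If a single attacker node makes it into the table before seeding runs, the honest bootstrap nodes are never contacted.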

Selecting peers (i.e., outgoing TCP connections)

Very roughly speaking, the Ethereum client picks half of its outgoing TCP connections from lookup_buffer and half from the table. More precisely, outgoing TCP connections are set up as follows:
- When the Ethereum client boots up, the task runner starts and runs continuously.
- The task runner populates the client's database and table and creates up to $[1/2(1 + maxpeers)]$ (default: 13) outgoing TCP connections to other nodes on the Ethereum network.

Task Runner

The task runner keeps a queue of tasks that need to run and runs up to 16 tasks concurrently.
Whenever fewer than 16 tasks are running, the task runner starts tasks from the queue until the maximum of 16 is reached.
New tasks that cannot start (because the maximum of 16 concurrently running tasks has been reached) are pushed onto the task runner's queue.
There are two types of tasks, discover_task and dial_task.
discover_task
- Calls lookup(t), where t is a random 256-bit string.
dial_task
- Attempts a new TCP connection (a dial) to another node. Before creating a dial_task for a node, the task runner first checks that the node is:
  1. not currently being dialed,
  2. not already a connected peer,
  3. not the client itself,
  4. not blacklisted, and
  5. not recently dialed.
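The five pre-dial checks translate directly into a predicate. The sketch below is illustrative only; `DialState` and `can_dial` are hypothetical names, and plain string sets stand in for whatever bookkeeping the client actually keeps.

```python
from dataclasses import dataclass, field


@dataclass
class DialState:
    self_id: str
    dialing: set = field(default_factory=set)          # currently being dialed
    peers: set = field(default_factory=set)            # already connected
    blacklist: set = field(default_factory=set)
    recently_dialed: set = field(default_factory=set)


def can_dial(state: DialState, node_id: str) -> bool:
    """Apply the five checks before creating a dial_task for node_id."""
    return (
        node_id not in state.dialing
        and node_id not in state.peers
        and node_id != state.self_id
        and node_id not in state.blacklist
        and node_id not in state.recently_dialed
    )


state = DialState(self_id="me", peers={"peer1"}, blacklist={"bad"})
print(can_dial(state, "fresh"))  # True
print(can_dial(state, "peer1"))  # False
print(can_dial(state, "me"))     # False
```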

Resolving an unknown IP

A dial_task is called on a NodeID. In general, the client knows the IP address associated with the NodeID, so it can easily initiate a TCP connection. However, there are two rare cases where the client does not know the IP address associated with NodeID n.
1. When the NodeID is statically configured by the user (without an IP address)
2. When the IP address field is empty
In these rare cases, the client resolves the NodeID n to its IP address using the traditional Kademlia iterative content resolution process.
That is, it calls lookup(n).
This is the only place where the Ethereum peer-to-peer network uses the iterative content lookup that Kademlia was designed to enable.

Simple Conclusion

The attacker monopolizes all incoming and outgoing connections of the victim, isolating the victim from the rest of the network's peers.
As a mitigation, increase the maximum number of connections per node and limit the number of nodes that can run behind a single IP address.
