A Gentle Introduction to Ethereum Networking overlay stack DevP2P

October 24th, 2022

“URGENT ALL MINERS: The network is under attack. …a computational DDoS, i.e. miners and nodes need to spend a very long time processing some blocks. …due to the EXTCODESIZE opcode, which has a fairly low gas price but which requires nodes to read state information from disk; the attack transactions are calling this opcode roughly 50,000 times per block.”

A DDoS attack on Geth in 2016, with the founder of Ethereum calling for a switch to Parity.

This quote piqued my curiosity about the DDoS resistance of a decentralized network. What does DDoS resistance mean here? Is an individual node resistant to DDoS through protocol robustness, or are there simply enough nodes in the Ethereum network that individual pawns may get sacrificed while the system keeps running?

I did understand that the 32 ETH stake deposit gives validators skin in the game, and that slashing of staked ETH punishes bad actors. Combined with gas fees, which deter actors from transaction spamming, these rules and incentives provide game-theoretic safety. What wasn’t obvious to me was how these deterrence mechanisms trickle down to the networking layer of Ethereum.

If Ethereum is a public, permissionless blockchain that allows anybody who knows how to install and run software on a piece of hardware to participate, wouldn’t it be easy to launch DoS attacks? Here are three stats from Certik that help convey the pervasiveness of this issue:

Over 2,000 DDoS attacks are observed worldwide every day.
One third of all downtime incidents are attributed to DDoS attacks.
$150 can buy a week-long DDoS attack on the black market.

Ethereum nodes have public IP addresses, and most of them run open source clients. Some could be operating behind a proxy or a sophisticated firewall that supports deep packet inspection, placed in an ISP network that offers clean bandwidth and protection against typical volumetric DDoS attacks. But that is not the default expectation or vision for a participant’s experience.

If patching a fleet of servers with a bugfix is a pain even in a centralized environment, such as a company where there is no choice but to comply, I couldn’t begin to imagine the state of patches on the nodes running Ethereum. Ethereum does focus on client diversity so that a bug in one client type does not bring the entire network down, but it is also notable that Ethereum suffers from a concentration risk problem.

It turns out that some of these questions and concerns are at least partially addressed in the protocol design of DevP2P. This article is the first in a series that looks into the internals, and it summarizes what I found when I plunged headlong into the rabbit hole of the Ethereum networking stack. These findings are the result of running Wireshark utilities on a full node (a Prysm consensus client plus a Besu execution client) without depositing 32 ETH or running the validator process. This specific article is a gentle introduction to the history of overlays and to the Ethereum networking overlay stack, DevP2P.

Ethereum is a peer-to-peer network with thousands of nodes that must be able to communicate with one another using standardized protocols. The "networking layer" is the stack of protocols that allow those nodes to find each other and exchange information. This includes "gossiping" information (one-to-many communication) over the network as well as swapping requests and responses between specific nodes (one-to-one communication). Each node must adhere to specific networking rules to ensure they are sending and receiving the correct information.

History of Overlays

"Mr. Watson come here. I want to see you."

Alexander Graham Bell spoke these words to Watson, his assistant, and set off a momentous revolution in the world of networks by transmitting speech through electromagnetism. George W. Coy, inspired by a lecture of Bell’s, put the world’s first commercial telephone exchange into operation in New Haven, Connecticut in January 1878. The exchange featured a central switchboard that allowed any of its 21 clients to talk to each other if they had a telephone. For several decades, while the capability and functionality of these switches improved exponentially, the underlying objective stayed essentially the same: create a closed circuit, i.e. connect the line between two parties that wish to speak to each other. This type of switching is called circuit switching. The Public Switched Telephone Network (PSTN) was born by connecting groups of clients and the exchanges that served them.

The internet as an Overlay

The internet began as an overlay on the phone network. In the early 1930s the teleprinter, which evolved from iterative innovations in telegraphy, was adapted and overlaid on a circuit-switched network to facilitate point-to-point and even multipoint communication. Then came fax machines. The early internet (ARPANET) was overlaid on the Public Switched Telephone Network (PSTN). The early pioneers of the internet innovated on the computer nodes and the edge hardware that connected to this network, while piggybacking on the widely available telephone network. TCP/IP was born, and Tim Berners-Lee created the hypertext protocol at CERN, unleashing the era of the internet dubbed Web 1.0. The internet was so consequential that it nudged the Jurassic telephone network to rebuild its core around IP. This phenomenon is called inversion. Regulation and a best-effort service model allowed the IP layer to be overlaid on any network link layer and below. This open model created unprecedented permissionless innovation.

The Internet is Overlaid

The internet movement spawned a new breed of innovators who believed that the internet would be the path to a permissionless, decentralized, censorship-resistant future. These innovators used overlay networks on top of the internet to establish decentralised peer-to-peer (p2p) systems. Paratii has a lovely illustration that traces the history and contributions of the p2p movement.

The structure of the Ethereum Overlay Network

The Ethereum overlay network is built on both TCP and UDP. This is not very different from how we use DNS over UDP to discover the address of a server, and TCP / TLS / HTTP for client-server communication.

DNSSEC adds a layer of security by adding cryptographic signatures to existing DNS records, and TLS allows for Perfect Forward Secrecy when communicating over an insecure channel, namely the internet. Ethereum achieves the same outcome without a central certificate authority by cleverly combining bonding during discovery with ECIES for communication.

UDP for discovery
TCP for everything else

How does discovery work

1. Ping Pong and Bonding over UDP
2. Secured key exchange using ECIES on RLPx over TCP
3. Find Neighbours and Get List of Neighbours
4. Iterate over list of nodes and progress

Standing on the shoulders of Giants

DevP2P, the overlay network of Ethereum, borrows heavily from several innovations that occurred in adjacent fields, such as

Public-Key Encryption ( PKE ) & Elliptic Curve Cryptography ( ECC )
Kademlia Distributed Hash Tables ( k-DHT )
TCP / IP


+----------------------+------------------------------------------+
| Foundational Element |                 Purpose                  |
+----------------------+------------------------------------------+
| PKE & ECC            | Addressing Scheme & Secure communication |
| k-DHT                | Node discovery & Bootstrapping           |
| TCP/IP               | Permissionless innovation                |
+----------------------+------------------------------------------+

PKE & ECC

The first step towards participation is communication, the first step towards communication is discovery, and the first step towards discovery is having a unique identifier / address. Ethereum nodes can have addresses thanks to public key cryptography. Ethereum uses elliptic curve cryptography for its

Addressing system
Secure P2P communications

Addressing System

Ethereum’s overlay network has to be self organising from an addressing perspective.

There should be no central coordinating entity that controls who gets what address and how the network gets organised.
The address space has to be large enough to accommodate many types of actors (wallets, nodes, smart contracts etc.).
Addresses have to be collision resistant and governed by randomness, so that the chance of two participants ever ending up with the same address is negligible.

Encryption

The earliest known forms of encryption, such as Caesar’s cipher, have been around since Caesar’s time, but encryption and secure communication always required pre-shared secrets. That means the two or more parties that needed to communicate had to have a pre-established secure channel over which to exchange the secret. This was the Achilles heel. The idea of public key cryptography changed all of that. It was pioneered by Martin Hellman, Whitfield Diffie and Ralph Merkle in 1976 at Stanford while they theorized about the Knapsack problem.

Knapsack problem theorized by Diffie, Hellman & Merkle

Elliptic curves have fascinated humankind since the 2nd century AD, when Diophantus first described the family of equations now known as Diophantine equations, the family that also includes Fermat’s Last Theorem. Despite two millennia of study, elliptic curves found mainstream use in cryptography only in 1985, when Victor Miller and Neal Koblitz introduced elliptic curve cryptography.

y^2 = x^3 + ax + b (mod p)

The curve behind all the magic

Why not RSA or something else ?

To build some intuition, the researcher Arjen Lenstra introduced the idea of expressing the effort needed to break a key in terms of the energy / computational cycles required to do so.

Breaking a 228-bit RSA key requires less energy than it takes to boil a teaspoon of water. Breaking a 228-bit elliptic curve key requires enough energy to boil all the water on earth. For the same level of security with RSA, you’d need roughly 10x the bit length.

Multiple ways of visualizing Key and Node address Generation

A visualisation of the geometry of how the key-pair generation happens

Pick an arbitrary / random 256-bit number from some RNG or PRNG. This is the private key.
Use that number to multiply the base point G, whose coordinates (Gx, Gy) are defined by the secp256k1 curve.
The result is the public key, a point (x, y) on the curve.
Hash the public key and keep 20 bytes of the digest to get the node address.

If we had to write a program to do this for us, what would it look like?

Programmatic implementation of Keypair generation
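
A minimal sketch of the four steps in pure Python, using only the standard library. The secp256k1 constants are the published curve parameters; the function names (`point_add`, `scalar_mult`, `generate_keypair`, `toy_address`) are made up for illustration. One loud assumption: Ethereum hashes with Keccak-256, which Python's `hashlib` does not ship, so `toy_address` uses `hashlib.sha3_256` purely as a stand-in and its output is not a real Ethereum address. Real clients rely on hardened, constant-time libraries rather than code like this.

```python
import hashlib
import secrets

# Published secp256k1 parameters: y^2 = x^3 + 7 over the prime field P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) is the point at infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def generate_keypair():
    """Steps 1 to 3: random private key, then public key = private key * G."""
    private_key = secrets.randbelow(N - 1) + 1  # uniform in [1, N - 1]
    return private_key, scalar_mult(private_key, G)


def public_key_bytes(public_key):
    """Serialise the point (x, y) as 32 + 32 = 64 bytes."""
    x, y = public_key
    return x.to_bytes(32, "big") + y.to_bytes(32, "big")


def toy_address(public_key):
    """Step 4 with a STAND-IN hash (SHA3-256, not Ethereum's Keccak-256)."""
    return hashlib.sha3_256(public_key_bytes(public_key)).digest()[-20:]


private_key, public_key = generate_keypair()
print("private key:", private_key.to_bytes(32, "big").hex())
print("public key :", public_key_bytes(public_key).hex())
print("toy address:", toy_address(public_key).hex())
```

Running it prints a fresh 32-byte private key, a 64-byte public key and a 20-byte toy address each time.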

The node address is 20 bytes / 160 Bits long (extracted from the public key)

The public key itself is 512 bits / 64 bytes long
The private key is 256 bits / 32 bytes long

The Node Address space is a 160 bit address space meaning there could be all of 2^160 addresses (including smart contracts, wallets and nodes).
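
To get a feel for how roomy a 160-bit space is, the birthday bound gives a rough estimate of the chance that any two of n randomly chosen addresses collide: about n(n-1) / (2 * 2^160). The snippet below is a back-of-the-envelope sketch; the figure of one trillion addresses is an arbitrary assumption, not a measurement.

```python
from fractions import Fraction

ADDRESS_BITS = 160


def collision_probability(n, bits=ADDRESS_BITS):
    """Birthday-bound approximation for n uniformly random addresses."""
    return Fraction(n * (n - 1), 2 * 2**bits)


# Hypothetical population of one trillion addresses.
p = collision_probability(10**12)
print(f"approximate collision probability: {float(p):.3e}")
```

Even with a trillion addresses in use, the estimate stays many orders of magnitude below anything that matters in practice.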

How do nodes securely communicate with each other

Over and above encryption, signatures ensure that a sender cannot deny having sent a message, provided the private key is kept secret. Signatures give the receiver reason to believe the data was sent by the sender and only the sender.

+-------------------+------------+
|      Outcome      | Technique  |
+-------------------+------------+
| Authenticity      | Signatures |
| Non-repudiability | Signatures |
+-------------------+------------+

It is a two-phase system: the sender signs, and the receiver verifies the signature. Packets are dropped if the signature is invalid.

Sign = Signature Fn ( Kpriv, Hash Fn( MsgType, Msg ) )

Cryptographic Signing process of Payload

Kpub = Erecover ( Hash Fn( MsgType, Msg ), Sign )
isValid = Validation Fn ( Kpub, Hash Fn( MsgType, Msg ), Sign )
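
The sign-then-verify flow can be sketched with textbook ECDSA over secp256k1 in pure Python. Several things here are assumptions made for the sake of a self-contained example: SHA-256 stands in for Keccak-256 (which `hashlib` does not provide), the hash covers the packet type byte plus the payload, the function names are invented, and public key recovery (the Erecover step) is left out, so the verifier is simply handed the public key. It is a teaching sketch, not production cryptography.

```python
import hashlib
import secrets

# Published secp256k1 parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(packet_type, payload):
    """STAND-IN hash: SHA-256 here, whereas Ethereum uses Keccak-256."""
    digest = hashlib.sha256(bytes([packet_type]) + payload).digest()
    return int.from_bytes(digest, "big")


def sign(private_key, packet_type, payload):
    """Sender side: produce an ECDSA signature (r, s)."""
    z = message_hash(packet_type, payload)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, packet_type, payload, signature):
    """Receiver side: return False so the caller can drop the packet."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_hash(packet_type, payload)
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1
public_key = scalar_mult(private_key, G)
signature = sign(private_key, 0x01, b"ping")
print(verify(public_key, 0x01, b"ping", signature))  # untouched packet
print(verify(public_key, 0x01, b"pong", signature))  # tampered payload
```

A receiver that gets `False` back from `verify` would simply drop the packet, which is the behaviour described above.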

Packet Structure

All UDP packets used for discovery follow the structure described here. The first 98 bytes are used for packet integrity, packet authenticity and packet identification. They are followed by a data payload of arbitrary length that is RLP encoded (RLP plays a role analogous to JSON serialization and deserialization).
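
As a sketch of that layout: in the discovery v4 wire format, as I understand it, the 98-byte header breaks down into a 32-byte hash, a 65-byte signature and a 1-byte packet type, and everything after that is the RLP payload. The `DiscoveryPacket` class and `split_packet` function below are illustrative names, and the sketch only slices the datagram; it does not check the hash or the signature.

```python
from dataclasses import dataclass

HASH_LEN = 32        # integrity: hash over the rest of the packet
SIGNATURE_LEN = 65   # authenticity: recoverable signature
HEADER_LEN = HASH_LEN + SIGNATURE_LEN + 1  # plus 1 byte of packet type = 98


@dataclass(frozen=True)
class DiscoveryPacket:
    packet_hash: bytes
    signature: bytes
    packet_type: int
    payload: bytes  # still RLP encoded


def split_packet(datagram: bytes) -> DiscoveryPacket:
    """Slice a raw UDP datagram into header fields and RLP payload."""
    if len(datagram) < HEADER_LEN:
        raise ValueError("datagram shorter than the 98-byte header")
    sig_end = HASH_LEN + SIGNATURE_LEN
    return DiscoveryPacket(
        packet_hash=datagram[:HASH_LEN],
        signature=datagram[HASH_LEN:sig_end],
        packet_type=datagram[sig_end],
        payload=datagram[HEADER_LEN:],
    )


# Made-up datagram: zeroed hash, dummy signature, type 0x01, empty RLP list.
example = bytes(32) + b"\x11" * 65 + b"\x01" + b"\xc0"
packet = split_packet(example)
print(packet.packet_type, len(packet.signature), packet.payload.hex())
```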

For example ping contains the following fields

1. From Node ( IP, TCP Port, UDP port )
2. To Node ( IP, TCP Port, UDP Port ) 
3. RLP version
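
RLP (Recursive Length Prefix) serialises byte strings and nested lists. The encoder below is a compact sketch of the encoding rules. The ping values (IP addresses, ports and the version number 4) are made-up examples, the order of fields inside each list is my assumption, and real ping packets carry more fields than the three listed above.

```python
import ipaddress


def encode_length(length, offset):
    """Prefix for a string (offset 0x80) or a list (offset 0xC0)."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Encode non-negative ints, bytes, or (nested) lists of those."""
    if isinstance(item, int):
        # Integers become big-endian bytes without leading zeros; 0 -> b"".
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    payload = b"".join(rlp_encode(element) for element in item)
    return encode_length(len(payload), 0xC0) + payload


def endpoint(ip, udp_port, tcp_port):
    """Hypothetical helper: one node endpoint as an RLP-ready list."""
    return [ipaddress.ip_address(ip).packed, udp_port, tcp_port]


ping = [
    4,                                    # version (assumed value)
    endpoint("192.0.2.1", 30303, 30303),  # from node (example address)
    endpoint("192.0.2.2", 30303, 30303),  # to node (example address)
]
print(rlp_encode(ping).hex())
```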

Kademlia Distributed Hash Table

Ethereum adapted the Kademlia DHT for node discovery and addressing only, since Ethereum is not a content storage p2p network. Kademlia was chosen because of a few properties:

1. O(log n) search complexity
2. XOR distance metric
3. Fixed size routing tables

O(log n) Search Complexity

XOR distance metric logic

XOR is a good distance metric because it has some important properties. If two nodes A and B have random addresses, the XOR distance from A to B is the same as from B to A. The distance from a node to itself is 0, and if A is not equal to B then the distance is non-zero. It also satisfies the triangle inequality, which states that for any third point C, the sum of the distances A,B and B,C is always greater than or equal to the distance A,C.
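
These properties are easy to check numerically. The sketch below draws three random 160-bit addresses and tests each property in turn; `xor_distance` is just an illustrative name for the `^` operator on integers.

```python
import secrets


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR read as an unsigned integer."""
    return a ^ b


A, B, C = (secrets.randbits(160) for _ in range(3))

print("symmetric    :", xor_distance(A, B) == xor_distance(B, A))
print("zero to self :", xor_distance(A, A) == 0)
print("non-zero else:", A == B or xor_distance(A, B) > 0)
print("triangle     :",
      xor_distance(A, B) + xor_distance(B, C) >= xor_distance(A, C))
```

The triangle inequality holds because A xor C equals (A xor B) xor (B xor C), and the XOR of two numbers can never exceed their sum.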

Routing Table

The address space is 160 bits, and the routing table has one bucket per bit position, from the LSB all the way to the MSB. Each bucket lists nodes at a similar XOR distance from the local node: bucket i holds the nodes whose distance falls between 2^i and 2^(i+1), i.e. nodes that share the local node's leading bits and first differ from it at bit i. Assume that the node address is 11………1. Each bucket keeps its nodes sorted by when they last responded and by how long they have been present in the routing table. Ethereum clients may limit the maximum number of peers, and the discV5 implementation keeps up to 16 nodes per sorted bucket.
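
A toy simulation can make the bucket layout and the logarithmic lookup concrete. Everything about the setup is an assumption chosen to keep the example small: 16-bit IDs instead of 160-bit ones, 200 nodes, 4 entries per bucket, and routing tables filled from a global view of the network that real nodes never have. At each hop the lookup moves to the known node closest to the target, which clears at least one more leading bit of the distance, so it can never take more hops than there are ID bits.

```python
import random

ID_BITS = 16      # toy ID size (the address space above is 160 bits)
BUCKET_SIZE = 4   # toy bucket capacity (the text above mentions 16)


def bucket_index(me, other):
    """Bucket i holds nodes whose XOR distance lies in [2**i, 2**(i + 1))."""
    return (me ^ other).bit_length() - 1


def build_table(me, all_ids, rng):
    """Fill each bucket, then keep at most BUCKET_SIZE random entries."""
    buckets = [[] for _ in range(ID_BITS)]
    for other in all_ids:
        if other != me:
            buckets[bucket_index(me, other)].append(other)
    return [rng.sample(b, min(BUCKET_SIZE, len(b))) for b in buckets]


def lookup(start, target, tables):
    """Greedy walk: always hop to the known node closest to the target."""
    current, hops = start, 0
    while current != target:
        known = [node for bucket in tables[current] for node in bucket]
        current = min(known, key=lambda node: node ^ target)
        hops += 1
    return current, hops


rng = random.Random(42)
ids = rng.sample(range(2**ID_BITS), 200)
tables = {node: build_table(node, ids, rng) for node in ids}
found, hops = lookup(ids[0], ids[-1], tables)
print(f"reached {found:#06x} from {ids[0]:#06x} in {hops} hops")
```

Real discovery asks several nodes in parallel and learns its table from the replies, but the shrinking-distance argument is the same.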

ECIES Key Exchange

ECIES stands for Elliptic Curve Integrated Encryption Scheme and is part of a family of encryption systems called integrated encryption schemes. It builds on top of the Diffie-Hellman exchange. The concept is quite simple: Alice and Bob exchange only their public keys, and each of them derives the same shared key without ever sending any secret information over the insecure channel.
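
The underlying idea can be shown with classic finite-field Diffie-Hellman, which is simpler to read than the elliptic curve variant that ECIES actually uses (there, modular exponentiation is replaced by multiplying a curve point by a private scalar). The prime, the generator and the function names below are toy choices made for this sketch and are far too weak for real use.

```python
import hashlib
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; toy-sized, not secure
GENERATOR = 3       # arbitrary base chosen for the sketch


def make_keypair():
    """Return (private exponent, public value GENERATOR**private mod PRIME)."""
    private = secrets.randbelow(PRIME - 3) + 2  # in [2, PRIME - 2]
    return private, pow(GENERATOR, private, PRIME)


def shared_key(my_private, their_public):
    """Combine my secret with their public value, then hash into a key."""
    secret = pow(their_public, my_private, PRIME)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only alice_public and bob_public ever cross the insecure channel.
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
print(alice_key == bob_key)
```

Both sides end up with the same 32-byte key because (g^a)^b and (g^b)^a are the same number modulo the prime.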

Where does the overlay network reside

Every active participant in Ethereum that runs a node of any kind (light, full, archival, validator, relayer), or any other type of node that could arise in the future, must run clients. Almost all of the Ethereum Foundation's specifications are codified as implementations in these clients. The clients are open source, so anyone is free to modify them and create their own version. The overlay network DevP2P is a part of the client(s). So when you install the client binaries on a virtual machine and start the process, the node starts communicating with other nodes after following a series of steps. Unlike a typical client-server network, Ethereum’s p2p network is a client-client network, where all clients of a given type are born equal and build their reputation based on their time and behaviour in the network.

How does all of this come together

Some of the constructs deeply embedded in the protocol design make the Ethereum network, and even an individual node, decently resistant to basic attacks such as eclipse and Sybil attacks. Even for volumetric DDoS attacks, packets that get dropped in the TCP stack have a reduced impact on the application. Of course, this is far from perfect.

1. Eth clients reach out to hardcoded bootstrap nodes run by the Ethereum Foundation.
2. Eth clients bond with a bootstrap node to get peers.
3. Eth clients ping/pong with multiple peers to sync with the chain.
4. The consensus client and the execution client have their own p2p overlay networks.
5. Node IDs can be different, and ENRs are different, between consensus and execution clients.
6. Without a signed ping/pong bonding, your packets will be dropped.
7. All further find-node requests and enumeration happen after the ECIES key exchange, which means arbitrary packet floods get dropped.
8. Every k-bucket holds a maximum of 16 nodes, and the number of peers is further limited by the max peers parameter in the client's config file.
9. This list is in turn sorted by time spent in the node’s routing table and time to respond.
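
The bonding rule in point 6 can be sketched as a small gatekeeper that remembers which peers have recently answered a ping and drops find-node requests from everyone else. The class name, the method names and the 12-hour expiry window are assumptions made for this illustration, not code or constants taken from any client.

```python
import time

BOND_EXPIRY_SECONDS = 12 * 60 * 60  # assumed window for this sketch


class BondTracker:
    """Remember which peers have completed a ping/pong exchange."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_pong = {}  # node id -> time of the last valid pong

    def record_pong(self, node_id):
        """Call after a correctly signed pong arrives from node_id."""
        self._last_pong[node_id] = self._clock()

    def is_bonded(self, node_id):
        seen = self._last_pong.get(node_id)
        return seen is not None and self._clock() - seen < BOND_EXPIRY_SECONDS

    def handle_find_node(self, node_id):
        """Answer bonded peers; silently drop requests from everyone else."""
        return "neighbors" if self.is_bonded(node_id) else "drop"


tracker = BondTracker()
print(tracker.handle_find_node("peer-a"))  # no bond yet
tracker.record_pong("peer-a")
print(tracker.handle_find_node("peer-a"))  # bonded
```

Because an unbonded sender gets no reply at all, a spoofed flood of find-node requests cannot be turned into amplified traffic towards a victim.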