Validating From Hardware Enclaves

Special thanks to Jason & Amir from Puffer for review and discussion

In the following, we will take a closer look at secure-signer, a remote-signing tool protecting validator keys and adding a slashing protection fence inside trusted hardware.

We will also go through setting up a validator using secure-signer on ephemery testnet.


UPDATE: Here’s a 15min devconnect2023 TL;DR of this blogpost


Quick recap: What to protect against? And how to protect validator keys?

Node operators protect their validator keys to mitigate slashing risks, which may arise in cases such as:

  • accidentally running the same keys simultaneously in two places

  • a compromised host/node (or the accessing client)

  • a bug in a consensus client implementation

  • physical node theft

  • a rogue cloud provider or staking service employee

    → and subsequent ransom attacks

Most common protection schemes can be categorised as follows:

src: https://www.attestant.io/posts/protecting-validator-keys/ - slightly modified
  • Distributed key generation & threshold signing, e.g. Dirk / Vouch in a geographically distributed multi-region setup

    or similarly: Distributed Validator Technology (DVT), e.g. Obol’s

  • Remote-signing with web3signer, i.e. separating validator keys from the validator client, often hosted in cloud-based secure vaults like AWS or Azure; or rather exotic at-home setups with a dual-lan port NUC connected to an “offline” Raspberry Pi holding the validator keystore

  • Remote passphrase: fully or partially encrypted internal or external hard drives storing the keystore decryption passphrase

  • Local encryption: air-gapped generated keystore file + decryption passphrase stored “hot” on disk / in memory - usually the default setup for at-home staking

  • Validator keys generated and stored in hardware enclaves with secure-signer, i.e. not on drive, but in encrypted memory

What does validating from hardware enclaves even mean?

“[...] staking via trusted hardware [is a method], where your staking key would be within a system that is very difficult to exfiltrate keys from [...]” - Vitalik

Essentially, a hardware enclave mimics hardware wallet functionality in order to protect validator keys at rest, but on top of that makes it even “smarter” in the sense that it adds anti-slashing protection at runtime.

One of the core primitives is the generation of validator keys within a hardware enclave, from which they cannot be extracted, but from which validators can perform their consensus duties.

In contrast to resource-intensive privacy-enhancing computational techniques like multi-party computation (e.g. for DVT) or fully homomorphic encryption, hardware enclaves are capable of arbitrary, inexpensive computation almost at the speed of the native CPU. This provides significant performance advantages, particularly low latency.

The initial concept for staking from secure enclaves stems from Justin Drake’s ethresear.ch post, proposing to use trusted hardware for trustless staking pools.

What is secure-signer?

Secure-Signer is an open-source validator remote-signing tool built by the Puffer team. It was originally supported by an Ethereum Foundation grant.

Secure-Signer functions as an independent implementation of Consensys' Web3Signer that shifts validator key management and signing logic to hardware enclaves, separating them from the consensus/validator client.

secure-signer in a nutshell

This means that validator keys are generated and stored within SGX's encrypted enclave memory (see below). Also, the hardware enclave provides guarantees that it will not sign a slashable offense by maintaining an integrity-protected database of previously signed messages.
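The slashing-protection guarantee can be illustrated with a minimal sketch. It is not taken from the secure-signer codebase: the function names and the "high-watermark" rule (only ever sign strictly newer messages) are assumptions chosen for illustration, and a real signer would read the last signed values from its integrity-protected database rather than take them as arguments.

```shell
#!/bin/sh
# Hypothetical sketch, NOT secure-signer's actual implementation.
# A signer that only signs strictly "newer" messages cannot produce
# a double proposal, a double vote or a surround vote.

# safe_to_sign_block LAST_SIGNED_SLOT NEW_SLOT
# Succeeds only if the new slot is strictly greater than the last signed slot.
safe_to_sign_block() {
    [ "$2" -gt "$1" ]
}

# safe_to_sign_attestation LAST_SOURCE LAST_TARGET NEW_SOURCE NEW_TARGET
# Succeeds only if the source epoch does not decrease
# and the target epoch strictly increases.
safe_to_sign_attestation() {
    [ "$3" -ge "$1" ] && [ "$4" -gt "$2" ]
}
```

With these rules, `safe_to_sign_block 100 100` fails, so a second proposal for slot 100 would be refused, no matter which client asks for the signature.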

What is SGX / TEE / Remote Attestation?

Intel’s Software Guard Extensions (SGX) is a technology built into certain Intel processors allowing users to encrypt their data in an isolated and protected portion of memory (called an enclave).

The enclave ensures that the code is executed as expected without tampering and that the data remains encrypted, not leaving the underlying CPU’s memory - not even being accessible to the OS or the system administrator. The physical hardware ensures that these properties hold. More broadly speaking and in essence, such trusted execution environments (TEEs) provide confidentiality and integrity guarantees on program code and data.


Secure hardware elements can be found in most modern smartphones and hardware wallets; however, they generally do not support verifiable remote attestations.

Validators can make use of Intel attesting to the fact that only secure-signer (the program), and with it the validator key and the slashing-protection logic & database, is running within an Intel-manufactured hardware enclave. The untrusted host (or potentially other third parties) interacting with the enclave subsequently verifies Intel’s attestation in order to gain trust in the enclave.

This so-called remote attestation feature by Intel is not strictly necessary when setting up secure-signer in SGX as long as node operators trust themselves when interacting with the enclave. Nevertheless, it serves as a useful sanity check and proof, increasing confidence that the latest version of secure-signer indeed runs correctly in an SGX enclave with up-to-date firmware.

Liquid solo-staking pools, as one of several examples by contrast, necessarily require these (publicly) verifiable remote attestations in order to gain trust in node operators.


What is secure about hardware enclaves anyways?

From a blockchain purist's perspective, there are arguments against relying on a centralised entity like Intel for highest-grade security. For one, users trust Intel’s technology which itself is proprietary and closed source. Furthermore, SGX has been subject to vulnerabilities in the past. Its reputation is not necessarily the best.

Nonetheless, it can be argued that the use of SGX may serve as a purely additive security layer to existing staking setups. SGX is widely available, a tool at hand today, that can expand the validator key protection design space and increase remote-signing software diversity, ultimately benefiting the network as a whole.

At the end of the day, stakers need to define their individual threat models and make trust assumptions with regards to the technologies they use, see for example the Ledger firmware debate earlier this year.

It is also worth pointing out other SGX use cases in the blockchain context such as the secret network, encrypted mempools like SUAVE, running Geth / Reth / a block builder inside SGX, permissionless (re-)staking pools as well as multi-proving and -verifying systems, e.g. for zkRollups.

How could node operators benefit from validating with secure-signer?

Because it runs in an isolated, tamper-resistant hardware enclave, secure-signer helps to avoid slashable offenses and hence adds extra layers of defense against risks such as:

  • accidentally running the same keys twice, i.e. equivocations (keys only ever exist once within one unique enclave)

  • physical theft of the hardware/node/enclave (see above)

  • a critical consensus client bug (secure-signer makes use of a dedicated, tamper-resistant slashing-protection database managed within the enclave)

  • a directly compromised host/node (e.g. in case of a hack, OS 0-days or kernel-level malware)

  • a compromised client accessing/SSH-ing into the host/node (e.g. because of parallel usage for browsing/e-mail/entertainment activities)

    → and subsequent ransom attacks

All in all, secure-signer not only protects against hackers, but also against accidental key leakage or rogue system administrators as well as potential ransom attacks.

It is also at-home staking friendly, since it can be run on consumer grade hardware (NUCs) or Xeon servers, reducing dependency on cloud solutions. The remote attestation feature moreover increases trust in solo-stakers by third parties, expanding the design options for permissionless (re-)staking pools.

Last but not least, the more validator keys run in TEEs, the lower the risk of equivocations and thus a highly correlated mass-slashing event, benefitting the health of the network.

What is on the roadmap for secure-signer?

→ The secure-signer codebase has recently been released and is still in alpha. It needs review & audit.

→ Currently only Intel’s SGX enclave technology is supported, but AMD SEV support and similar solutions like AWS Nitro Enclaves are in the works.

Note: since Intel deprecated SGX support in its consumer product line in 2021, the range of available at-home hardware has become more limited.

→ As staking from TEEs is complementary to DVT, the integration with DVT providers is a work-in-progress.

→ Secure-Signer UI/UX in its current form is not best suited for mass key-management (feel free to report issues and/or get involved)


How to run a validator with secure-signer on ephemery testnet?

This guide mainly follows the official secure-signer setup documentation. However, we make some specific adjustments for:

  • running secure-signer on at-home hardware (NUC) - no cloud!

  • adapting the networking configuration for ephemery testnet, because we want to sync the node within minutes and have access to easily obtainable testnetETH

  • running the full node stack (execution + consensus + validator client & secure-signer) in docker compose on one machine (ephemery requires < 1GB of disk space)

For the very basics you may want to take a look at the fundamentals of running a node in Docker first.

Prerequisites

  • Ubuntu 22.04 + Docker Compose for software orchestration (not strictly necessary)

  • SGX capable hardware:

    This guide has been tested on a 4-core NUC7CJYH (Celeron J4005) which is capable of SGX (and FLC), averaging a modest 5 W power consumption (TDP 10 W).

    • Occlum LibOS, the framework that secure-signer currently uses to run applications on SGX, requires SGX Flexible Launch Control (FLC)

      note: an attempt to follow this guide on a NUC10 failed because that machine lacked FLC

    alternatively: testing may happen on Azure Cloud

    (Although defeating the purpose and hence not part of this guide: you can test secure-signer as a pure Rust app with non-sgx-compatible hardware)

  • SGX activated in BIOS/UEFI settings

    • 128MB isolated memory proves to be sufficient in order to run secure-signer & the validator key(s) + the slashing-protection database

  • Installed SGX drivers (on Linux provided as a kernel module) and Intel’s platform software, which provides the runtime environment for enclaves, named Architectural Enclave Service Manager (AESM)
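A quick way to check the hardware prerequisites is to look at the CPU flags reported by the kernel. The helper below is a sketch with a hypothetical name. It assumes a kernel with in-tree SGX support (5.11 or later), where the flags are usually listed as sgx and sgx_lc (launch control); older kernels or out-of-tree drivers may not expose them this way.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]   (hypothetical helper)
# Succeeds if FLAG appears as a whole word in the first CPU "flags" line.
# CPUINFO_FILE defaults to /proc/cpuinfo; it is only a parameter so the
# helper can be tried against a saved copy of the file.
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw "$1"
}

# Example usage (not executed here):
#   has_cpu_flag sgx && has_cpu_flag sgx_lc && echo "SGX with FLC reported"
```

If the flags are missing although the CPU should support them, SGX may still be disabled in the BIOS/UEFI settings.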

NUC-7PJYHN (manufactured in 2022) + 8GB RAM + 256 GB 2,5” SATA SSD

Some SGX Vocabulary

Without going too deep down the SGX rabbit hole, the following SGX vocabulary should be noted:

  • MRENCLAVE = hash value of the program binary in the enclave - to us representing the identity of the SGX enclave

    • verifying this value assures us that only the correct version of secure-signer is running on SGX; we also pass it to the client CLI, the user interface for interacting with the enclave

    • the value may change depending on the used version of secure-signer; cross-checking MRENCLAVE can either be done via self-building or by comparing with developer-signed and -published values, e.g. on-chain or GitHub

  • MRSIGNER = hash value representing the identity of the author/creator of the SGX enclave


Setting up an ephemery node

In our /home directory we create a project folder /ephemery with subfolders:

mkdir -p ephemery/{geth,lighthouse-bn,lighthouse-vc,JWT,testnet-all}

Generate a JSON Web Token (JWT) secret so that the execution and consensus client can communicate securely:

openssl rand -hex 32 | tr -d "\n" > "$(pwd)/ephemery/JWT/jwtsecret"
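Before starting the clients it can help to confirm that the secret file has the expected shape. The following is a small sketch with a hypothetical helper name; it only assumes what the command above produces, namely 32 random bytes written as 64 hex characters.

```shell
#!/bin/sh
# is_valid_jwtsecret FILE   (hypothetical helper)
# Succeeds if FILE contains exactly 64 hexadecimal characters.
is_valid_jwtsecret() {
    secret="$(cat "$1")" || return 1
    case "$secret" in
        *[!0-9a-fA-F]*) return 1 ;;   # any non-hex character, incl. inner newlines
    esac
    [ "${#secret}" -eq 64 ]
}

# Example usage (not executed here):
#   is_valid_jwtsecret ~/ephemery/JWT/jwtsecret && echo "jwtsecret looks fine"
```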

Obtain and extract the necessary testnet files (genesis info, boot nodes, etc) for the current ephemery iteration (at the time of writing #92):

(We hope that client teams will support “--network=ephemery” in the future.)

cd ephemery/testnet-all
wget https://github.com/ephemery-testnet/ephemery-genesis/releases/download/ephemery-92/testnet-all.tar.gz
tar -xzf testnet-all.tar.gz

Preparations for testing on ephemery

We test on ephemery, because the hardware requirements are comparably low, allowing us to run a full node on a slightly underpowered SGX-capable NUC7 for easier demonstration purposes. It is moreover extremely easy to obtain testnetETH. Limited state and block history will allow for syncing the full node within minutes.

However, secure-signer currently lacks out-of-the-box support for ephemery. We thus need to create a network configuration file ephemery_network_config.json which we will manually import into the secure-signer container at a later stage:

{
   "network_name": "ephemery",
   "deposit_cli_version": "2.3.0",
   "fork_info": {
       "fork": {
           "previous_version": "0x1000101b",
           "current_version": "0x1000101b",
           "epoch": "0"
       },
       "genesis_validators_root": "0x9c11ae92a2ddfc3122ffcc1e6c19297b6767ea6436e45a85c80cc8bf48646bab"
   }
}

Note: Ephemery’s GENESIS_FORK_VERSION is static (0x1000101b), but the GENESIS_VALIDATORS_ROOT changes with every iteration. Retrieve the latest iteration's value either via a “GET /eth/v1/beacon/genesis” API call or from the nodevars_env.txt in the ~/ephemery/testnet-all directory downloaded earlier.
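Since the root changes with every iteration, the config file can also be regenerated from the beacon API response. The sketch below uses two hypothetical helpers and plain sed instead of a JSON parser; it assumes the standard response layout of the genesis endpoint, with a "genesis_validators_root" field holding a 0x-prefixed 32-byte hex string.

```shell
#!/bin/sh
# extract_gvr   (hypothetical helper)
# Reads a /eth/v1/beacon/genesis JSON response on stdin and prints
# the genesis_validators_root value.
extract_gvr() {
    tr -d ' \n' |
        sed -n 's/.*"genesis_validators_root":"\(0x[0-9a-fA-F]\{64\}\)".*/\1/p'
}

# write_network_config GENESIS_VALIDATORS_ROOT   (hypothetical helper)
# Prints the network config shown above with the given root filled in.
write_network_config() {
    printf '%s\n' \
        '{' \
        '   "network_name": "ephemery",' \
        '   "deposit_cli_version": "2.3.0",' \
        '   "fork_info": {' \
        '       "fork": {' \
        '           "previous_version": "0x1000101b",' \
        '           "current_version": "0x1000101b",' \
        '           "epoch": "0"' \
        '       },' \
        "       \"genesis_validators_root\": \"$1\"" \
        '   }' \
        '}'
}

# Example usage once the beacon node is running (not executed here):
#   gvr="$(curl -s http://localhost:5052/eth/v1/beacon/genesis | extract_gvr)"
#   write_network_config "$gvr" > ephemery_network_config.json
```

Port 5052 is the beacon node HTTP port that the validator client in this guide connects to; adjust it if your setup differs.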


Running the node

Below is a generic docker-compose.yaml, which we place in our project directory, opting for Geth as execution client, Lighthouse as consensus & validator client and secure-signer as remote-signer:

version: "3.4"

volumes:
    Secure-Signer-Backup:

services:
    secure-signer:
        image: pufferfinance/secure_signer:latest
        network_mode: host
        container_name: secsigner
        restart: on-failure
        devices:
            - /dev/sgx/enclave:/dev/sgx/enclave
            - /dev/sgx/provision:/dev/sgx/provision
        volumes:
            - Secure-Signer-Backup:/Secure-Signer
            - /var/run/aesmd:/var/run/aesmd
        command: >
            /bin/bash -c "occlum run /bin/secure-signer 9001"

    validator:
        image: sigp/lighthouse:latest
        network_mode: host
        container_name: vc
        restart: on-failure
        volumes:
            - ./lighthouse-vc:/root/.lighthouse
            - ./testnet-all:/ephemery
        command: >
          lighthouse
          --testnet-dir ephemery
          validator
          --beacon-nodes http://localhost:5052
          --suggested-fee-recipient 0x0000000000000000000000000000000000000000
          --init-slashing-protection

    consensus:
        image: sigp/lighthouse:latest
        network_mode: host
        container_name: lighthouse
        restart: on-failure
        volumes:
            - ./lighthouse-bn:/root/.lighthouse
            - ./JWT:/JWT
            - ./testnet-all:/ephemery
        command: >
          lighthouse
          --testnet-dir ephemery
          beacon_node
          --datadir /root/.lighthouse
          --eth1
          --http
          --validator-monitor-auto
          --execution-endpoints http://localhost:8551
          --execution-jwt /JWT/jwtsecret
          --boot-nodes enr:-Iq4QNMYHuJGbnXyBj6FPS2UkOQ-hnxT-mIdNMMr7evR9UYtLemaluorL6J10RoUG1V4iTPTEbl3huijSNs5_ssBWFiGAYhBNHOzgmlkgnY0gmlwhIlKy_CJc2VjcDI1NmsxoQNULnJBzD8Sakd9EufSXhM4rQTIkhKBBTmWVJUtLCp8KoN1ZHCCIyk,enr:-Iq4QIc297-de1P6hznMX2cIdVsQkve9BD9NUsJ7vVQa7eh5UpekA9rLid5A-yLiS3gZwOGugYZPi58x76zNs2cEQFCGAYhBJlTYgmlkgnY0gmlwhEFtmi6Jc2VjcDI1NmsxoQJDyix-IHa_mVwLBEN9NeG8I-RUjNQK_MGxk9OqRQUAtIN1ZHCCIyg

#         --checkpoint-sync-url https://checkpointz.bordel.wtf/

    execution:
        image: ethereum/client-go:stable
        network_mode: host
        container_name: geth
        restart: on-failure
        volumes:
            - ./geth:/root/.ethereum
            - ./JWT:/JWT
        command: >
          --datadir /root/.ethereum
          --http
          --authrpc.jwtsecret /JWT/jwtsecret
          --networkid 39438092
          --syncmode full
          --bootnodes enode://0f2c301a9a3f9fa2ccfa362b79552c052905d8c2982f707f46cd29ece5a9e1c14ecd06f4ac951b228f059a43c6284a1a14fce709e8976cac93b50345218bf2e9@135.181.140.168:30343

Make sure to adjust Geth’s networkID to the current ephemery iteration (find the number here, or look for chainId in ~/ephemery/testnet-all/genesis.json)

For bootnodes look in ~/ephemery/testnet-all/boot_enr.txt
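The network ID can be read from the downloaded genesis file instead of being copied by hand. This is a sketch with a hypothetical helper name; it assumes that genesis.json stores chainId as a plain number, as Geth genesis files usually do.

```shell
#!/bin/sh
# chain_id_from_genesis FILE   (hypothetical helper)
# Prints the numeric chainId found in a Geth genesis.json.
chain_id_from_genesis() {
    tr -d ' \n\t' < "$1" |
        sed -n 's/.*"chainId":\([0-9][0-9]*\).*/\1/p'
}

# Example usage (not executed here):
#   chain_id_from_genesis ~/ephemery/testnet-all/genesis.json
```

The printed value is what goes into Geth's --networkid flag in the compose file above.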


We then provide Geth with ephemery’s genesis state (note: run this from the project directory):

docker run -it -v $(pwd)/geth:/root/.ethereum -v $(pwd)/testnet-all/genesis.json:/genesis.json ethereum/client-go:stable --datadir /root/.ethereum init genesis.json

Afterwards, start the execution & consensus client and sync ephemery:

docker compose up -d execution
docker compose up -d consensus

Starting secure-signer

We start the secure-signer container, automatically creating a persistent docker volume which permanently stores encrypted enclave data (e.g. for the purpose of re-instantiating the enclave after a reboot).

This command also initiates the hardware enclave via Occlum library OS which is part of the secure-signer docker image. The started secure-signer http-server will later communicate with the validator client (VC):

docker compose up -d secure-signer

Afterwards, retrieve (and verify) the “enclave ID” (MRENCLAVE value) in order to make sure the enclave indeed runs secure-signer:

docker exec secsigner /bin/bash -c "cat MRENCLAVE"
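The comparison with a published value can be scripted so that a stray newline or a difference in letter case does not cause a false alarm. The helper below is a sketch with a hypothetical name; it assumes that both values are bare 64-character hex strings, in the form the --mrenclave argument of the client CLI takes.

```shell
#!/bin/sh
# mrenclave_matches EXPECTED ACTUAL   (hypothetical helper)
# Succeeds if both values are the same 64-character hex string,
# ignoring letter case and surrounding whitespace.
mrenclave_matches() {
    expected="$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr -d '[:space:]')"
    actual="$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d '[:space:]')"
    [ "${#expected}" -eq 64 ] && [ "$expected" = "$actual" ]
}

# Example usage (not executed here), with EXPECTED taken from a source you trust:
#   mrenclave_matches "$EXPECTED" "$(docker exec secsigner /bin/bash -c 'cat MRENCLAVE')" \
#       || echo "MRENCLAVE mismatch, do not proceed"
```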

Currently, a version upgrade of secure-signer would require a withdrawal/re-deposit since the enclave creation policy in SGX ties the program version to the enclave ID (MRENCLAVE). It is intended to switch this policy to the author of an enclave (MRSIGNER) so that stakers can upgrade/patch secure-signer without losing access to the enclave/their keys.

Next, import the ephemery_network_config.json created earlier into the secure-signer container:

docker cp ephemery_network_config.json secsigner:/home/conf

Generating a validator key in secure-signer

Via the client CLI we may now generate a new validator keypair together with a slashing protection database and perform remote attestation:

docker exec -w /home secsigner /bin/bash -c "./client --bls-keygen --mrenclave ecd4d348d97cebb93ebc6b65bd7675e2861ddcb0f7ad9178b903600672561594"
by default Intel attests to the key generation

Afterwards, list and review the freshly generated key(s) secure-signer is safeguarding:

curl http://localhost:9001/eth/v1/keystores

Generating deposit data

The deposit data is generated within the enclave following the consensus specs and is signed with the specified validator key. Next, we extract the output file deposit_data.json to our project directory and inspect it. Note that the --execution-addr needs to be a regular ETH-address:

docker exec -w /home secsigner /bin/bash -c "./client --deposit --config conf/ephemery_network_config.json --validator-pk-hex 0xa5...58bd --execution-addr 0xd8...6045"
docker exec -w /home secsigner /bin/bash -c "cat ss_out/deposit_data.json" > ~/ephemery/deposit_data.json
cat ~/ephemery/deposit_data.json
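When inspecting the file, it is worth checking that the withdrawal credentials really point to your execution address. Under the consensus specs, credentials for an execution address consist of the prefix byte 0x01, eleven zero bytes and the 20-byte address. The helper below is a sketch with a hypothetical name that derives this value; that deposit_data.json stores it as hex without a 0x prefix is an assumption based on the usual deposit-cli format.

```shell
#!/bin/sh
# withdrawal_credentials_for ADDRESS   (hypothetical helper)
# Prints the 32-byte withdrawal credentials (hex, no 0x prefix)
# for a 20-byte execution address: 0x01 || 11 zero bytes || address.
withdrawal_credentials_for() {
    addr="$(printf '%s' "${1#0x}" | tr 'A-F' 'a-f')"
    case "$addr" in
        *[!0-9a-f]*) return 1 ;;      # reject non-hex input
    esac
    [ "${#addr}" -eq 40 ] || return 1
    # "01" + 22 zero characters (11 bytes) + 40 address characters
    printf '01%022d%s\n' 0 "$addr"
}

# Example usage (not executed here), with ADDR set to your --execution-addr:
#   grep -q "$(withdrawal_credentials_for "$ADDR")" ~/ephemery/deposit_data.json \
#       && echo "withdrawal credentials match"
```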

Depositing to the beacon chain

Obtain EphETH from one of the faucets. Go to the ephemery launchpad and drop the generated deposit_data.json.

Note: it will take about 7h for your deposit to be processed (because ephemery comes with an adjusted ETH1 follow distance)


Configure the validator client

While we await activation we make the Lighthouse-VC aware of secure-signer by configuring its validator_definitions.yml (following the web3signer specification):

---
- enabled: true
  voting_public_key: 0xa5...58bd
  type: web3signer
  url: "http://localhost:9001"
We point at localhost since the VC and secure-signer share the same host. (This step has also been tested with a Teku VC, likewise without TLS/SSL communication.) Note: initially, the Lighthouse VC may be run with the --init-slashing-protection flag.


Start the validator client

We then start the VC and further await the activation of the validator:

docker compose up -d validator

→ Et voilà: we should by now be validating from a hardware enclave

suggestion: modify the setup and test pointing the VC to a remote BN, i.e. not on the same machine, e.g. on Goerli or Holesky testnet


Coda: generating a voluntary exit message

It should be considered best practice to pre-sign voluntary exit messages in case you ever lose access to the enclave.

docker exec -w /home secsigner /bin/bash -c "./client --withdraw --config conf/ephemery_network_config.json --validator-pk-hex 0xa5...58bd --epoch 123 --validator-index 4567"

--epoch is the earliest epoch at which the voluntary exit can be processed, and --validator-index is the validator index
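To pick a sensible value for --epoch, the current epoch can be derived from the genesis time. The following is a sketch with a hypothetical helper name; it assumes 12-second slots and 32 slots per epoch (the mainnet preset), which should be checked against the downloaded testnet configuration.

```shell
#!/bin/sh
# current_epoch GENESIS_TIME [NOW]   (hypothetical helper)
# Prints the epoch that contains the Unix timestamp NOW (default: current time).
# Assumption: 12-second slots and 32 slots per epoch.
current_epoch() {
    now="${2:-$(date +%s)}"
    echo $(( (now - $1) / (12 * 32) ))
}

# Example usage (not executed here), with GENESIS_TIME taken from the
# genesis_time field of GET /eth/v1/beacon/genesis:
#   current_epoch "$GENESIS_TIME"
```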

Finally, we extract the output file voluntary_exit_message.json to our project directory and inspect it:

docker exec -w /home secsigner /bin/bash -c "cat ss_out/voluntary_exit_message.json" > ~/ephemery/voluntary_exit_message.json
cat ~/ephemery/voluntary_exit_message.json

Current caveats

  • As mentioned above, in the context of solo-staking it seems reasonable to add/switch the enclave instantiation policy to allow for upgrading/patching secure-signer without losing access to an existing enclave

  • Importing keys into secure-signer is currently buggy

  • A feature to enable TLS/SSL-communication between the VC and secure-signer may be added in the future (http is the default as of now)

    • note: in the above setup, we are neither exposing the VC nor secure-signer to the internet

  • By default, secure-signer listens on 127.0.0.1 (hard-coded into the enclave binary), which is why we set the Docker network mode to host, as opposed to e.g. a bridge network listening on all interfaces (0.0.0.0). This can be edited directly in the source code but requires recompiling the enclave following the developer instructions (which seems impractical)

  • Support for other testnets apart from Goerli, i.e. Holesky & Ephemery, is not natively integrated yet


****

Please keep in mind this is a testnet guide that may contain mistakes and that takes shortcuts which come with trade-offs. It could quickly become outdated as it’s subject to ever evolving network and software changes.
