Seen on C4: Storage Structs

One of the best parts of competing on Code4rena is reading code from many different projects and seeing different styles, designs, and techniques in the wild. This is an occasional series about interesting patterns I've seen on C4.

Seen in Astaria and Drips.

The problem

Upgradeable contracts are fragile: they have changeable bytecode, but immutable storage layouts. This means contract authors must stay keenly aware of any code changes that alter their implementation's storage layout. Accidentally introducing a storage collision in an upgrade is very easy to do, and can have disastrous consequences.

Techniques like storage gaps, explicit inheritance of storage-specific contracts, and external tooling that validates storage layouts at build/deploy time can all help protect against errors. But it's still really easy to mess up upgrades, especially when using libraries that leave many important responsibilities in your hands.

The pattern

Storage structs are a technique that can make working with storage in upgradeable contracts safer and more explicit. This pattern (or parts of it) is also known as "explicit storage buckets", "Diamond storage", and "unstructured storage".

It works like this: first, define a struct that represents your contract's storage. For example, the storage of a contract with three top level state variables like this one:

contract DefaultStorageLayout {
  address public admin;
  uint256 public number;
  bytes32 internal hash;
}

Can be represented as a struct like this one:

struct Storage {
  address admin;
  uint256 number;
  bytes32 hash;
}

Next, define a unique slot to store this struct in your contract. Using the hash of a unique string is one way to generate a slot that won't collide with other storage. By hashing a string and converting it to a uint256, we’ll get back a very big pseudorandom number in the 256-bit integer range. Since that range is so large, it’s extremely unlikely that our explicit slot will collide with anything else.

We can define this once, as a constant:

uint256 private constant STORAGE_SLOT = 
  uint256(keccak256("eth.horsefacts.contract.storage")) - 1;

It's common to generate a slot following the format defined in EIP-1967, which defined this method of hashing a string and subtracting one to generate specific slots for proxy configuration addresses. (This EIP was sort of the progenitor of the unstructured storage pattern, since it first stored important data in deterministic-but-unusual slots to prevent storage collisions).

EIP-1967 defined specific deterministic slots for proxy-related storage variables, like eip1967.proxy.admin and eip1967.proxy.implementation. One thing I like about this method is that you can give your slot a friendly human readable name. Any unique string works, but if you miss your old job, you can use a name that looks like an enterprise Java package, like com.mydomain.mycontract.storage.

It's also a good and paranoid practice to subtract 1 from the hashed value, which ensures that the selected slot is not associated with a known hash preimage. (In other words, although the value of the storage slot is known, nobody knows which input hashes to that value).
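As a quick sanity check of this derivation, here is a minimal sketch that recomputes a slot from its name and compares it with the implementation slot constant published in EIP-1967. The contract and function names (SlotDerivation, deriveSlot, matchesEip1967) are made up for illustration and don't come from any library.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper contract, for illustration only.
contract SlotDerivation {
  // Implementation slot constant as published in EIP-1967.
  bytes32 internal constant EIP1967_IMPLEMENTATION_SLOT =
    0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc;

  // Hash the name, then subtract one so the resulting slot
  // has no known keccak256 preimage.
  function deriveSlot(string memory name) public pure returns (bytes32) {
    return bytes32(uint256(keccak256(bytes(name))) - 1);
  }

  // Recomputes the EIP-1967 implementation slot from its name
  // and compares it with the published constant.
  function matchesEip1967() external pure returns (bool) {
    return deriveSlot("eip1967.proxy.implementation") == EIP1967_IMPLEMENTATION_SLOT;
  }
}
```

If the derivation follows the EIP, matchesEip1967() should return true. The same deriveSlot call with your own unique string gives you a slot for your storage struct.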

Once our slot is defined, we can add an internal helper function to load the storage struct from its defined slot. For this we'll need a dash of inline assembly. Assigning to a storage pointer's .slot in inline assembly directly sets the storage pointer to a specific slot address.

function _storage() private pure returns (Storage storage s) {
  // Since STORAGE_SLOT is a constant, we have to put a variable
  // on the stack to access it from an inline assembly block.
  uint256 slot = STORAGE_SLOT;
  assembly {
    s.slot := slot
  }
}

Our _storage function:

  1. creates a storage pointer s,

  2. sets its slot to STORAGE_SLOT, and

  3. implicitly returns it.

We can now use this internal helper to retrieve a storage pointer any time we need to read or write from storage. For example, we can set initial values in the constructor by loading the struct from storage and assigning them:

constructor(address _admin, uint256 _number, bytes32 _hash) {
  Storage storage s = _storage();
  s.admin = _admin;
  s.number = _number;
  s.hash = _hash;
}

Or call the helper directly from inside another function to access individual values:

function number() external view returns (uint256) {
  return _storage().number;
}
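Putting the pieces together, a complete contract might look roughly like the sketch below. It only assembles the snippets above into one compilable unit. The contract name StorageStructExample and the setNumber function are assumptions added for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract name, assembled from the snippets above.
contract StorageStructExample {
  struct Storage {
    address admin;
    uint256 number;
    bytes32 hash;
  }

  uint256 private constant STORAGE_SLOT =
    uint256(keccak256("eth.horsefacts.contract.storage")) - 1;

  constructor(address _admin, uint256 _number, bytes32 _hash) {
    Storage storage s = _storage();
    s.admin = _admin;
    s.number = _number;
    s.hash = _hash;
  }

  // Reads a single field through the storage pointer.
  function number() external view returns (uint256) {
    return _storage().number;
  }

  // Hypothetical setter (not in the original snippets): loads the
  // pointer once, then reads one field and writes another.
  function setNumber(uint256 newNumber) external {
    Storage storage s = _storage();
    require(msg.sender == s.admin, "not admin");
    s.number = newNumber;
  }

  // Points `s` at STORAGE_SLOT instead of a compiler-assigned slot.
  function _storage() private pure returns (Storage storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly {
      s.slot := slot
    }
  }
}
```

In a real proxy deployment these values would more likely be set in an initializer function, since a constructor writes to the implementation's own storage rather than the proxy's.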

Since our internal _storage() helper returns a storage pointer to our storage struct, reading number this way costs essentially the same as it would if these values were stored in top level state variables. We are just explicitly defining which of the 2^256 storage slots our contract's storage starts at, rather than letting the Solidity compiler automatically store it starting at slot zero. Since the compiler lays out the state variables of every inherited contract starting from slot zero, choosing a pseudorandom slot that is almost certainly very far from zero avoids the likeliest location for storage collisions.
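To make the "explicit slot" idea concrete, the sketch below reads the same value two ways: through the storage pointer, and with a raw sload at the slot where Solidity's layout rules place the field. The contract name RawSlotDemo, its slot string, and its function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, for illustration only.
contract RawSlotDemo {
  struct Storage {
    address admin;  // first slot of the struct
    uint256 number; // needs a full slot, so it lands in the second slot
    bytes32 hash;   // third slot
  }

  uint256 private constant STORAGE_SLOT =
    uint256(keccak256("example.rawslot.demo.storage")) - 1;

  function setNumber(uint256 n) external {
    _storage().number = n;
  }

  // Reads `number` through the storage pointer.
  function numberViaStruct() external view returns (uint256) {
    return _storage().number;
  }

  // Reads the same word directly: the struct starts at STORAGE_SLOT,
  // and `number` is stored one slot after the start.
  function numberViaSload() external view returns (uint256 value) {
    uint256 slot = STORAGE_SLOT + 1;
    assembly {
      value := sload(slot)
    }
  }

  function _storage() private pure returns (Storage storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly {
      s.slot := slot
    }
  }
}
```

After a call to setNumber, both getters should return the same value, since they read the same storage word.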

In addition to protecting upgradeable contracts against accidental storage collisions, one nice thing about this pattern is that it makes storage access explicit. It becomes very clear when a function reads from or writes to storage, which is sometimes not obvious when using state variables. Since reading and writing storage are among the most expensive operations on the EVM, this can be helpful.

However, there are still a few footguns to keep in mind. Unstructured storage is a way to tell the Solidity compiler "put my storage in this slot", but its layout will still follow Solidity's rules for storage variable layout.

That means, among other things:

  • Storage structs are still append-only. Adding a new variable to the beginning or middle of the struct in an upgraded implementation will still result in a storage collision. Don't reorder, remove, or change the types of existing fields once a storage struct has been deployed. (But it's OK to carefully append new fields at the end).

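  • Nested structs are stored inline, with their fields occupying consecutive slots inside the parent struct. Adding a field to a nested struct shifts every field that follows it in the parent, which causes storage collisions, so nested structs inside a storage struct generally can't be extended safely.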

If you want to use unstructured storage, it's generally a good practice to try and limit storage structs to simple value types and mappings rather than complex types like nested structs.
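The sketch below illustrates the append-only rule by pointing three different struct definitions at the same slot. In a real upgrade these would live in separate implementation versions. They are combined into one hypothetical contract (AppendOnlyDemo) here only so the example is self-contained.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, for illustration only.
contract AppendOnlyDemo {
  // Layout used by the first implementation.
  struct StorageV1 {
    address admin;
    uint256 number;
  }

  // Safe upgrade: existing fields keep their order, new field appended.
  struct StorageV2 {
    address admin;
    uint256 number;
    bytes32 hash;
  }

  // Unsafe upgrade: new field inserted at the front shifts the others.
  struct StorageBroken {
    bytes32 hash;
    address admin;
    uint256 number;
  }

  uint256 private constant STORAGE_SLOT =
    uint256(keccak256("example.appendonly.demo.storage")) - 1;

  function writeV1(address admin, uint256 number) external {
    StorageV1 storage s = _v1();
    s.admin = admin;
    s.number = number;
  }

  // Reads V1 data through the appended layout: admin and number
  // come back unchanged, and hash is zero until it is written.
  function readV2() external view returns (address, uint256, bytes32) {
    StorageV2 storage s = _v2();
    return (s.admin, s.number, s.hash);
  }

  // Reads V1 data through the broken layout: hash now overlaps the old
  // admin slot, and admin overlaps the old number slot.
  function readBroken() external view returns (bytes32, address, uint256) {
    StorageBroken storage s = _broken();
    return (s.hash, s.admin, s.number);
  }

  function _v1() private pure returns (StorageV1 storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly { s.slot := slot }
  }

  function _v2() private pure returns (StorageV2 storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly { s.slot := slot }
  }

  function _broken() private pure returns (StorageBroken storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly { s.slot := slot }
  }
}
```

After a call to writeV1, readV2 should return the original admin and number, while readBroken returns values read from the wrong slots.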

(If you’re interested in why mappings are safe, read up on how mappings are implemented in storage, which also answers why it’s not possible to retrieve a mapping’s keys on chain).
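As a rough illustration of why mappings don't disturb neighboring fields: a mapping only reserves one (empty) slot inside the struct, and each value lives at a slot derived by hashing the key together with that slot. The sketch below recomputes that location by hand. The contract name MappingSlotDemo, its slot string, and its function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, for illustration only.
contract MappingSlotDemo {
  struct Storage {
    uint256 number;                       // first slot of the struct
    mapping(address => uint256) balances; // reserves the second slot
  }

  uint256 private constant STORAGE_SLOT =
    uint256(keccak256("example.mapping.demo.storage")) - 1;

  function setBalance(address who, uint256 amount) external {
    _storage().balances[who] = amount;
  }

  // Reads the value through the mapping.
  function balanceViaMapping(address who) external view returns (uint256) {
    return _storage().balances[who];
  }

  // Reads the same value by hand. For a value-type key, the entry lives
  // at keccak256(key padded to 32 bytes ++ slot of the mapping).
  function balanceViaSload(address who) external view returns (uint256 value) {
    uint256 mappingSlot = STORAGE_SLOT + 1;
    bytes32 valueSlot = keccak256(abi.encode(who, mappingSlot));
    assembly {
      value := sload(valueSlot)
    }
  }

  function _storage() private pure returns (Storage storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly {
      s.slot := slot
    }
  }
}
```

Because every entry is scattered to its own hashed slot, adding entries never pushes other struct fields around. It also means the contract has no list of keys to enumerate.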

Additionally, take care to avoid accidentally loading storage structs into memory, especially if they are large. Always load a storage struct with data location storage to ensure you're using storage pointers rather than loading the full struct into memory, which can be expensive and error prone for large storage structs. (This is a good reason to use an internal _storage() helper function).
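Here is a minimal sketch of the difference between the two data locations. The contract name DataLocationDemo, its slot string, and its function names are hypothetical. Note that the memory copy only compiles because this struct contains no mappings.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, for illustration only.
contract DataLocationDemo {
  struct Storage {
    address admin;
    uint256 number;
    bytes32 hash;
  }

  uint256 private constant STORAGE_SLOT =
    uint256(keccak256("example.datalocation.demo.storage")) - 1;

  // Storage pointer: only the `number` slot is touched,
  // and the write persists.
  function setNumber(uint256 n) external {
    Storage storage s = _storage();
    s.number = n;
  }

  // Memory copy: every field is read from storage up front, and the
  // assignment only changes the copy. Nothing is written back.
  function setNumberOnCopy(uint256 n) external view returns (uint256) {
    Storage memory copy = _storage();
    copy.number = n;
    return copy.number;
  }

  function number() external view returns (uint256) {
    return _storage().number;
  }

  function _storage() private pure returns (Storage storage s) {
    uint256 slot = STORAGE_SLOT;
    assembly {
      s.slot := slot
    }
  }
}
```

Calling setNumberOnCopy returns the new value but leaves number() unchanged, which is exactly the kind of silent bug the storage data location avoids.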

Seen on C4

I've seen storage structs in two recent C4 contests.

Radicle Drips v2 uses unstructured storage throughout the codebase. One good example is the central DripsHub contract:

struct DripsHubStorage {
  /// @notice The next driver ID that will be used when registering.
  uint32 nextDriverId;
  /// @notice Driver addresses. The key is the driver ID, 
  /// the value is the driver address.
  mapping(uint32 => address) driverAddresses;
  /// @notice The total amount currently stored in DripsHub of each token.
  mapping(IERC20 => uint256) totalBalances;
}

/// @notice The ERC-1967 storage slot holding a single `DripsHubStorage` structure.
bytes32 private immutable _dripsHubStorageSlot = 
  _erc1967Slot("eip1967.dripsHub.storage");

/// @notice Returns the DripsHub storage.
/// @return storageRef The storage.
function _dripsHubStorage() internal view returns (DripsHubStorage storage storageRef) {
  bytes32 slot = _dripsHubStorageSlot;
  assembly {
    storageRef.slot := slot
  }
}

/// @notice Calculates the ERC-1967 slot pointer.
/// @param name The name of the slot, should be globally unique
/// @return slot The slot pointer
function _erc1967Slot(string memory name) internal pure returns (bytes32 slot) {
  return bytes32(uint256(keccak256(bytes(name))) - 1);
}

/// @notice Returns the driver address.
/// @param driverId The driver ID to look up.
/// @return driverAddr The address of the driver.
/// If the driver hasn't been registered yet, returns address 0.
function driverAddress(
  uint32 driverId
) public view returns (address driverAddr) {
  return _dripsHubStorage().driverAddresses[driverId];
}

function _decreaseTotalBalance(IERC20 erc20, uint128 amt) internal {
  _dripsHubStorage().totalBalances[erc20] -= amt;
}

So does Astaria. The entrypoint AstariaRouter.sol is a good example:

struct RouterStorage {
  uint32 auctionWindow;
  uint32 auctionWindowBuffer;
  uint32 liquidationFeeNumerator;
  uint32 liquidationFeeDenominator;
  uint32 maxEpochLength;
  uint32 minEpochLength;
  uint32 protocolFeeNumerator;
  uint32 protocolFeeDenominator;
  ERC20 WETH;
  ICollateralToken COLLATERAL_TOKEN;
  ILienToken LIEN_TOKEN;
  ITransferProxy TRANSFER_PROXY;
  address feeTo;
  address BEACON_PROXY_IMPLEMENTATION;
  uint88 maxInterestRate;
  uint32 minInterestBPS;
  address guardian;
  address newGuardian;
  uint32 buyoutFeeNumerator;
  uint32 buyoutFeeDenominator;
  uint32 minDurationIncrease;
  mapping(uint8 => address) strategyValidators;
  mapping(uint8 => address) implementations;
  //A strategist can have many deployed vaults
  mapping(address => bool) vaults;
}

uint256 private constant ROUTER_SLOT =
  uint256(keccak256("xyz.astaria.AstariaRouter.storage.location")) - 1;

function _loadRouterSlot() internal pure returns (RouterStorage storage rs) {
  uint256 slot = ROUTER_SLOT;
  assembly {
    rs.slot := slot
  }
}

function getAuctionWindow(bool includeBuffer) public view returns (uint256) {
  RouterStorage storage s = _loadRouterSlot();
  return s.auctionWindow + (includeBuffer ? s.auctionWindowBuffer : 0);
}

function getLiquidatorFee(uint256 amountIn) external view returns (uint256) {
  RouterStorage storage s = _loadRouterSlot();
  return amountIn.mulDivDown(
    s.liquidationFeeNumerator,
    s.liquidationFeeDenominator
  );
}

Note how the examples above both store simple value types and mappings in their storage structs, but avoid complex types like nested structs.

All proxy patterns are advanced techniques, and unstructured storage is no exception. It's important to have a good understanding of the EVM storage model and Solidity state variable layout to use any of them safely. And even if you're using a storage struct, you should still take extreme care if you need to change it as part of a contract upgrade. However, used carefully, storage structs can be an elegant way to prevent some of the most common causes of storage collisions.

You can read more about this pattern and see another detailed example under "Explicit Storage Buckets" in useful-solidity-patterns.

Thanks to devtooligan for reviewing a draft of this post, danielvf for his thread explaining the Audius exploit, and merklejerk for his own writeup of this pattern.
