Gas Optimization - Yul/Assembly

Cover Image by DESPOINA MATSINOPOULOU from Pixabay

Introduction

Recently, I followed the Advanced Solidity Bootcamp organized by Encode Club and delivered by Extropy.

Among the things we learnt was gas optimization. The two main areas we touched on were:

  • Storage optimization

  • Converting Solidity to Yul/Assembly.

Then I decided to apply this knowledge to a Talent Protocol production smart contract: TalentCommunitySale.sol, published in this repository here.

Note: If you want to jump into the code, click here.

Business Logic

A quick note on the business logic of this contract.

Talent Protocol published this contract to allow interested builders to pre-purchase $TALENT tokens by depositing an amount of $USDC.

There were 4 different tiers:

  • Tier 1: 100 USDC

  • Tier 2: 250 USDC

  • Tier 3: 500 USDC

  • Tier 4: 1000 USDC

The picture below tries to depict the main business flow:

Main TalentCommunitySale Business Flow

Assuming that Bonnie wants to deposit 100 USDC, she would call TalentCommunitySale#buyTier1().

There are requirements for this transaction to succeed:

  • The Sale should be active. Note that the Sale ran for a specific period of time in July 2024 (you can see the public contract here)

  • Bonnie should have enough $USDC, i.e. at least 100 for tier 1.

  • Bonnie should have approved the TalentCommunitySale contract as a spender of 100 USDC on her behalf.

  • Also, for each tier, Talent Protocol had specified a maximum number of sales. This was registered in the contract itself. Once this number was reached, the tier was considered sold out.

  • Finally, the same buyer was not allowed to purchase more than once, even if it were on different tiers.

All these business rules were coded as part of each buyTierX() function in the original contract code. Here is the related snippet for buyTier1():

require(saleActive, "TalentCommunitySale: Sale is not active");
require(
    paymentToken.allowance(msg.sender, address(this)) >= 100 * 10**tokenDecimals,
    "TalentCommunitySale: Insufficient allowance"
);
require(tier1Bought < TIER1_MAX_BUYS, "TalentCommunitySale: Tier 1 sold out");
require(!listOfBuyers[msg.sender], "TalentCommunitySale: Address already bought");
require(paymentToken.transferFrom(msg.sender, receivingWallet, 100 * 10**tokenDecimals), "Transfer failed");

Test Coverage

Before trying to optimize smart contract code, and as is good practice when refactoring any code in software engineering, one should have very good test coverage.

The original smart contract had some coverage with tests written in Hardhat and Ethers.

I decided to go with Foundry, mainly because that was the tool we were taught in the bootcamp.

Test Source Code

The test source code can be found here.

Test Main Takeaways

Dependency on an ERC20 Token

When your contract depends on an ERC20 token, you can use a mock smart contract while testing. This is the purpose of ERC20Mock.sol (and of ERC20MockBad.sol, which is used specifically for testing ReentrancyGuard).

Test 3rd-Party Inherited Contracts

When a contract derives from 3rd-party inherited contracts, like OpenZeppelin Ownable and ReentrancyGuard, you have to write tests for all the public features/functions that these contracts make your contract expose. This is because inheriting from these 3rd-party contracts is an implementation detail, which you might decide to change in the future while still keeping the functionality.

This is why you will see tests that test the Ownership features inherited from Ownable and a couple of tests testing ReentrancyGuard.

Testing ReentrancyGuard

This deserves its own paragraph in this post, because I had to use a special ERC20MockBad.sol contract to simulate the behavior of a bad contract whose function tries to call back into the calling function of the calling contract.

Calling

I implemented the ERC20MockBad.sol contract to call back into the buyTier1() function that is calling the bad contract’s transferFrom() function:

function transferFrom(address, address, uint256) public override returns (bool) {
    msg.sender.functionCall(abi.encodeWithSignature("buyTier1()"));

    return true;
}

Testing the re-entrancy had one glitch. The following code should have worked but it didn’t:

vm.prank(caller);
vm.expectRevert(abi.encodeWithSelector(TalentCommunitySale.ReentrancyGuardReentrantCall.selector));
talentCommunitySaleBad.buyTier1();

The error that it threw was:

Strange Error on Reentrancy Guard Test

In order to sort this out, I had to write:

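vm.prank(caller);
try talentCommunitySaleBad.buyTier1() {
    // Reaching this point means the call did not revert, so fail the test.
    revert("Expected ReentrancyGuardReentrantCall");
} catch (bytes memory err) {
    assertEq(bytes4(err), TalentCommunitySale.ReentrancyGuardReentrantCall.selector);
}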

Here, I catch the error and I check that the first 4 bytes of the error caught are equal to the selector for the custom error TalentCommunitySale.ReentrancyGuardReentrantCall.

Test Coverage

Foundry has a very good test coverage report:

One can run:

$ forge coverage

and Foundry will run the tests and print a report like this:

TalentCommunitySale - Test Coverage Report

Printing Messages to Console

Another very useful utility that helps debugging inside your tests is the console utility, which you can import from forge-std/Test.sol.
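As a minimal sketch of how this looks in practice (the test contract and its values are made up for illustration, and it assumes forge-std is installed in the project):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Test.sol re-exports the console library, so one import is enough.
import {Test, console} from "forge-std/Test.sol";

// Hypothetical test contract, not part of the original test suite.
contract ConsoleSketchTest is Test {
    function test_PrintsValuesWhileDebugging() public {
        address buyer = makeAddr("bonnie");
        uint256 amount = 100 * 10 ** 6; // 100 USDC, assuming 6 decimals

        // Printed in the test output when the verbosity is high enough.
        console.log("buyer:", buyer);
        console.log("amount:", amount);

        assertEq(amount, 100_000_000);
    }
}
```

The messages show up when the tests are run with increased verbosity, for example forge test -vv.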

USDC Dependency

The USDCTMock is an ERC-20 compatible contract that I used in my tests, instead of the real USDC contract.

Main Foundry Cheat Codes

The main Foundry cheat codes that I used:

  • vm.prank() to set the msg.sender of the following transaction.

  • vm.expectRevert() to set revert expectations.

  • vm.expectEmit() to set emit event expectations.
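The three cheat codes can be seen working together in the following sketch. TinySale is a hypothetical, stripped-down contract that exists only for this example; it is not the real TalentCommunitySale, and the sketch assumes forge-std is installed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical, stripped-down sale used only to demonstrate the cheat codes.
contract TinySale {
    event Bought(address indexed buyer);
    error AlreadyBought();

    mapping(address => bool) public listOfBuyers;

    function buy() external {
        if (listOfBuyers[msg.sender]) revert AlreadyBought();
        listOfBuyers[msg.sender] = true;
        emit Bought(msg.sender);
    }
}

contract CheatCodesSketchTest is Test {
    // Redeclared so that the test can emit the expected event itself.
    event Bought(address indexed buyer);

    TinySale internal sale;
    address internal bonnie = makeAddr("bonnie");

    function setUp() public {
        sale = new TinySale();
    }

    function test_BuyEmitsEvent() public {
        vm.prank(bonnie); // the next contract call is made with msg.sender == bonnie

        // Check topic 1 (the indexed buyer) and the (empty) data.
        vm.expectEmit(true, false, false, true);
        emit Bought(bonnie);

        sale.buy();

        assertTrue(sale.listOfBuyers(bonnie));
    }

    function test_SecondBuyReverts() public {
        vm.prank(bonnie);
        sale.buy();

        vm.prank(bonnie);
        vm.expectRevert(TinySale.AlreadyBought.selector);
        sale.buy();
    }
}
```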

Storage Layout

Initially, I did the exercise of reducing the number of slots the contract occupied.

I can get the storage layout and information about how state variables use the storage with the following Foundry command:

$ forge inspect --pretty TalentCommunitySale storageLayout

Before I made any changes, the storage layout was:

Initial Storage Layout

As you can see, it occupied 8 slots.

At the end of the exercise, the storage layout became this:

Storage Layout After

As you can see, the optimized storage uses 3 fewer slots: 5 instead of 8.
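The general idea behind this kind of reduction is to order the state variables so that the small ones share a slot. The two contracts below are a made-up illustration of that idea, not the actual layout of TalentCommunitySale; the variable names are only borrowed to make it easier to follow.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical variables, declared in an order that wastes space: 5 slots.
contract LayoutBefore {
    bool public saleActive;         // slot 0 (1 byte used, 31 bytes unused)
    uint256 public totalRaised;     // slot 1 (needs a full slot)
    uint32 public tier1Bought;      // slot 2 (4 bytes used)
    uint256 public totalBuyers;     // slot 3 (needs a full slot)
    address public receivingWallet; // slot 4 (20 bytes used)
}

// Same variables, reordered so that the small ones share a slot: 3 slots.
contract LayoutAfter {
    uint256 public totalRaised;     // slot 0
    uint256 public totalBuyers;     // slot 1
    address public receivingWallet; // slot 2, offset 0  (20 bytes)
    uint32 public tier1Bought;      // slot 2, offset 20 (4 bytes)
    bool public saleActive;         // slot 2, offset 24 (1 byte)
}
```

Running the forge inspect command shown above against each of the two contracts is a quick way to confirm the slot and offset of every variable.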

Code optimizations

With regards to code optimization, as I said at the beginning, I converted a lot of Solidity code to Yul/Assembly code.

The source code can be found here. But the key takeaways I would like to mention in this blog post are:

Don’t use Magic Numbers - Use Constants

Don’t just throw magic numbers like 100 in the code. Use constants instead. Also, if a state variable never changes, make it constant; otherwise it wastes storage space.

This is a good example:

uint32 public constant TIER1_MAX_BUYS = 100;

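Just keep in mind, though, that not everything you declare at the Solidity level can be used at the assembly level. Depending on the compiler version, a constant that is defined by an expression rather than by a plain number, or an immutable variable, may be rejected inside an assembly block. In that case, you need to assign it to a local variable before being able to access it inside the assembly block.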
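Here is a minimal sketch of the local-variable workaround. The contract, the constant value (100 USDC with an assumed 6 decimals) and the function are hypothetical and only serve the illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract ConstantInAssemblySketch {
    // Hypothetical constant, defined by an expression rather than a plain number.
    uint256 private constant TIER1_AMOUNT = 100 * 10 ** 6;

    function tier1AmountDoubled() external pure returns (uint256 result) {
        // Copy the constant into a local variable first...
        uint256 amount = TIER1_AMOUNT;

        assembly {
            // ...because local variables of value type can be read in assembly.
            result := mul(amount, 2)
        }
    }
}
```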

Immutable

If something takes its value inside the constructor and never changes again, declare it as immutable. It saves storage space, because the value is kept in the deployed bytecode rather than in a storage slot.

Example:

uint256 public immutable TIER1_AMOUNT;
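A slightly fuller sketch, assuming the amount is derived from the token decimals in the constructor (the constructor parameter and the second variable are assumptions added for contrast):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract ImmutableSketch {
    // Kept in the deployed bytecode, so it does not occupy a storage slot.
    uint256 public immutable TIER1_AMOUNT;

    // A regular state variable, shown for contrast: it occupies storage slot 0.
    uint256 public tier1AmountInStorage;

    constructor(uint8 tokenDecimals) {
        uint256 amount = 100 * uint256(10) ** tokenDecimals;
        TIER1_AMOUNT = amount;         // can only be assigned during construction
        tier1AmountInStorage = amount; // costs an sstore now and an sload on every read
    }
}
```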

Slot Numbers, Offsets and Sizes as Private Constants

When I wrote Yul/Assembly, I had to reference slot numbers, offsets and sizes of state variables. If you don’t use private constants to refer to them, it will be difficult to update your code should you decide to further optimize the storage. Constants also make the code easier to read. It is another case of avoiding magic numbers.

Example:

uint8 private constant STORAGE_TIER1_BOUGHT_SLOT = 3;
uint8 private constant STORAGE_TIER1_BOUGHT_OFFSET = 20;
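One way to guard against such constants drifting out of sync with the real layout is to compare them with the .slot and .offset values that the compiler exposes for state variables inside assembly. The sketch below assumes a made-up layout in which three variables share slot 0; it is not the layout of the real contract.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract SlotConstantsSketch {
    // Hypothetical layout: all three variables share slot 0.
    address public receivingWallet; // offset 0
    bool public saleActive;         // offset 20
    uint32 public tier1Bought;      // offset 21

    uint8 private constant STORAGE_TIER1_BOUGHT_SLOT = 0;
    uint8 private constant STORAGE_TIER1_BOUGHT_OFFSET = 21;

    // Returns true while the hand-written constants match what the compiler chose.
    function tier1BoughtConstantsMatch() external view returns (bool ok) {
        uint256 slot;
        uint256 offset;
        assembly {
            slot := tier1Bought.slot
            offset := tier1Bought.offset
        }
        ok = slot == STORAGE_TIER1_BOUGHT_SLOT && offset == STORAGE_TIER1_BOUGHT_OFFSET;
    }
}
```

A test that asserts on such a function would fail as soon as someone reorders the state variables without updating the constants.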

Removed Dependencies to OpenZeppelin Ownable and ReentrancyGuard

The implementation of ownership and of the reentrancy guard has been moved inside the contract code. This allowed me to rewrite the previously inherited functions in assembly, alongside the other optimizations.

Removed Dependency to Math

This library was not necessary. I removed it.

Removed require Calls

I removed all the require calls. I replaced them with revert calls and custom errors.
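As an illustration of the pattern, here is a sketch with two of the checks from buyTier1() rewritten. The error names and the setSaleActive() helper are hypothetical; the real contract may name and structure things differently.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract CustomErrorSketch {
    // Hypothetical error names, chosen for this example only.
    error SaleIsNotActive();
    error TierSoldOut(uint8 tier);

    uint32 public constant TIER1_MAX_BUYS = 100;

    bool public saleActive;
    uint32 public tier1Bought;

    function setSaleActive(bool active) external {
        saleActive = active;
    }

    function buyTier1() external {
        // Before: require(saleActive, "TalentCommunitySale: Sale is not active");
        if (!saleActive) revert SaleIsNotActive();

        // Before: require(tier1Bought < TIER1_MAX_BUYS, "TalentCommunitySale: Tier 1 sold out");
        if (tier1Bought >= TIER1_MAX_BUYS) revert TierSoldOut(1);

        tier1Bought += 1;
    }
}
```

With a custom error, the revert data is a 4-byte identifier followed by the encoded arguments, instead of an ABI-encoded message string that also has to be stored in the bytecode.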

Non-Slot Aligned Variable Update Is Hard

Reducing the slots in the storage has a disadvantage when you write assembly. It requires a lot of careful bit operations to change the value of a variable which does not have its own dedicated slot.

Take for example the saleActive bool which is in the middle of slot 5. It occupies 1 byte at offset 20.

If I want to set it to true, I have to write this assembly code:

assembly {
    let slotSaleActiveValue := sload(STORAGE_SALE_ACTIVE_SLOT)
    let offsetBits := mul(STORAGE_SALE_ACTIVE_OFFSET, 8)
    let zeroMask := not(shl(offsetBits, 0xFF))
    let setMask := shl(offsetBits, 0x01)
    sstore(STORAGE_SALE_ACTIVE_SLOT, or(and(slotSaleActiveValue, zeroMask), setMask))
}

If saleActive occupied an entire slot, it would only be a matter of calling sstore(). But with saleActive sitting between other variable values in slot 5, I need to make sure that I update only the specific byte at offset 20 and leave the rest as it was.
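To experiment with this masking technique in isolation, the sketch below puts it into a small contract that can set, clear and read the flag. The layout is an assumption made for the example (three variables sharing slot 0), so the slot constant differs from the one used in the real contract.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract PackedFlagSketch {
    // Hypothetical layout: the three variables share slot 0.
    address public receivingWallet; // slot 0, offset 0
    bool public saleActive;         // slot 0, offset 20
    uint32 public tier1Bought;      // slot 0, offset 21

    uint8 private constant STORAGE_SALE_ACTIVE_SLOT = 0;
    uint8 private constant STORAGE_SALE_ACTIVE_OFFSET = 20;

    constructor(address wallet) {
        receivingWallet = wallet;
        tier1Bought = 7; // arbitrary value, to show that neighbours stay intact
    }

    function setSaleActive(bool active) external {
        assembly {
            let slotValue := sload(STORAGE_SALE_ACTIVE_SLOT)
            let offsetBits := mul(STORAGE_SALE_ACTIVE_OFFSET, 8)
            // 1s everywhere except in the byte that holds saleActive.
            let zeroMask := not(shl(offsetBits, 0xFF))
            // The new byte (0x00 or 0x01), moved to its position in the slot.
            let setMask := shl(offsetBits, iszero(iszero(active)))
            sstore(STORAGE_SALE_ACTIVE_SLOT, or(and(slotValue, zeroMask), setMask))
        }
    }

    function readSaleActive() external view returns (bool active) {
        assembly {
            let slotValue := sload(STORAGE_SALE_ACTIVE_SLOT)
            // Shift the byte down to position 0 and drop everything above it.
            active := and(shr(mul(STORAGE_SALE_ACTIVE_OFFSET, 8), slotValue), 0xFF)
        }
    }
}
```

After calling setSaleActive(true) or setSaleActive(false), the compiler-generated getters receivingWallet() and tier1Bought() should still return the values set in the constructor, which is the whole point of the two masks.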

Revert With Custom Code in Assembly

When reverting with a custom error in assembly, here is how to do it:

  1. Save the free memory pointer, so that it can be used as the memory position at which to store the revert information.

let freeMemoryPointer := mload(0x40)
let initialFreeMemoryPointer := freeMemoryPointer

  2. Assuming that I want to revert with the custom error OwnableUnauthorizedAccount(address), I have to store the error identifier:

mstore(freeMemoryPointer, shl(mul(28, 8), 0x118cdaa7))
freeMemoryPointer := add(freeMemoryPointer, 4)

Note: How do I find the identifier? Various different methods. You can use the cast utility that is coming with foundry:

$ cast keccak 'OwnableUnauthorizedAccount(address)'
0x118cdaa7a341953d1887a2245fd6665d741c67c8c50581daa59e1d03373fa188

… and take the first 4 bytes (0x118cdaa7).

The identifier, 0x118cdaa7, needs to be left aligned before being stored in memory at the free memory pointer. That’s why I do shl(mul(28, 8), 0x118cdaa7). I shift left by 28 bytes, i.e. 224 bits (28 is the result of 32, the size in bytes of a memory word, minus 4, which is the size in bytes of the identifier).

Which makes the memory position looking something like this:

Memory with 0x118cdaa7 stored left aligned

Note: After I set a value to the memory, I then update the free memory pointer value to point to the next position available for writing to memory. That’s the second command you see in the snippet above: freeMemoryPointer := add(freeMemoryPointer, 4). I move the free memory pointer by 4 which is the number of bytes occupied by the identifier.

  3. Then I write the run-time argument values of the custom error. In this case, there is only one argument, msg.sender, which is of type address.

In assembly the msg.sender is accessed with the Yul function caller().

The address type in Ethereum is 20 bytes long, but, in Yul/assembly it is always right aligned in 32 bytes with leading 0s. This is because the only available type in Yul/assembly is u256 i.e. a 32 bytes long unsigned integer.

Hence, I am not going to write 20 bytes into the memory. I am going to write 32 bytes.

mstore(freeMemoryPointer, caller())
freeMemoryPointer := add(freeMemoryPointer, 32)

This means that both identifier and caller address will be stored in memory like this:

Memory storing identifier and caller address

Then, as part of a good practice, I save the new free memory pointer value:

mstore(0x40, freeMemoryPointer)

  4. Reverting

Finally, reverting is a matter of calling the revert function with the memory pointer holding the revert data, i.e. the identifier and the caller address:

revert(initialFreeMemoryPointer, 36)

Reverting with a custom error that takes more than one run-time argument follows the same technique and rules.
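Putting the steps together, a compact sketch of an owner check could look like this. The contract, the function name and the assumption that owner lives alone in slot 0 are made up for the example. Since execution stops at the revert, the sketch skips updating the free memory pointer and writes the argument directly at offset 4.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract AssemblyRevertSketch {
    error OwnableUnauthorizedAccount(address account);

    address public owner; // slot 0 in this hypothetical layout

    constructor() {
        owner = msg.sender;
    }

    function onlyOwnerAction() external view returns (bool) {
        assembly {
            // owner is alone in slot 0 here, so a plain sload is enough.
            if iszero(eq(caller(), sload(0))) {
                let ptr := mload(0x40)
                // 4-byte error identifier, left aligned in the 32-byte word.
                mstore(ptr, shl(224, 0x118cdaa7))
                // The address argument, padded to 32 bytes, right after the identifier.
                mstore(add(ptr, 4), caller())
                // 4 + 32 = 36 bytes of revert data.
                revert(ptr, 36)
            }
        }
        return true;
    }
}
```

In a Foundry test, calling onlyOwnerAction() from a pranked address that is not the owner should be matched by vm.expectRevert(abi.encodeWithSelector(AssemblyRevertSketch.OwnableUnauthorizedAccount.selector, thatAddress)), where thatAddress stands for whichever address was pranked.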

Emitting Events

Emitting events in assembly is a matter of using the correct logX() function.

logX() functions

I will show you an example for the event OwnershipTransferred(address,address).

This event is actually declared as:

event OwnershipTransferred(
    address indexed previousOwner,
    address indexed newOwner
);

This is a 3-topic event, because the hash of the event signature is always the first topic:

  • OwnershipTransferred(address,address)

  • previousOwner

  • newOwner

with no extra data.

Hence, I am using the log3 function.

log3(
  0x00,
  0x00,
  0x8be0079c531659141344cd1fd0a4f28419497f9722a3daafe3b4186f6b6457e0, 
  oldOwner,
  newOwner
)

How do I get the signature of the event? Again I can use different methods. Like cast or forge. Let’s use forge here:

$ forge inspect --pretty TalentCommunitySale events
{
  "OwnershipTransferred(address,address)": "0x8be0079c531659141344cd1fd0a4f28419497f9722a3daafe3b4186f6b6457e0",
  ...
}
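For context, here is how the log3 call could sit inside a complete function. This is a sketch under assumptions: owner is placed in slot 0, and access control is left out to keep the example short.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract AssemblyEventSketch {
    // Declared so that the event is part of the ABI and tools can decode the log.
    event OwnershipTransferred(address indexed previousOwner, address indexed newOwner);

    address public owner; // slot 0 in this hypothetical layout

    constructor() {
        owner = msg.sender;
    }

    // Access control is intentionally left out of this sketch.
    function transferOwnership(address newOwner) external {
        assembly {
            let oldOwner := sload(0)
            sstore(0, newOwner)
            // There is no non-indexed data, so the memory range is empty (offset 0, size 0).
            log3(
                0x00,
                0x00,
                0x8be0079c531659141344cd1fd0a4f28419497f9722a3daafe3b4186f6b6457e0,
                oldOwner,
                newOwner
            )
        }
    }
}
```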

Accessing Dynamic Variables - mapping

mapping type variables do not store their payload at the position in storage at which they are declared. They still occupy 1 slot in storage, but that slot stays empty (its value is 0x0); its number is only used to derive the positions of the entries.

The actual payload is stored in storage at positions which are dynamically determined.

I will explain this with the example of the variable:

mapping(address => bool) public listOfBuyers;  

The storage layout shows that it is stored on slot number 2:

Only 1 slot for listOfBuyers

But if I deploy the contract and check with cast storage <contract-address> 2 I will see the value of this slot being 0x0:

0x0 value for slot 2

But, if I have stored the value true for the address 0x324e9E13dd19528D0F390201923d17c4B7E94462, then this value is stored at the position:

keccak256(0x324e9E13dd19528D0F390201923d17c4B7E94462,2)

The keccak256() assembly function returns a 32-byte number, which is the storage slot for the given address (0x324…) and mapping slot number 2. To be precise, the arguments to the Yul/assembly function point to memory: a start position and a length. Hence, I first have to store the address and the slot number into memory, each one padded to 32 bytes.

Here is the assembly code that accesses a mapping using this method. I am checking whether the caller() is in the listOfBuyers by accessing the listOfBuyers[caller()].

let buyerAddress := caller()
let freeMemoryPointer := mload(0x40)
let initialFreeMemoryPointer := freeMemoryPointer

mstore(freeMemoryPointer, buyerAddress) // 32 bytes
freeMemoryPointer := add(freeMemoryPointer, 32)

mstore(freeMemoryPointer, STORAGE_LIST_OF_BUYERS_SLOT)
freeMemoryPointer := add(freeMemoryPointer, 32)
mstore(0x40, freeMemoryPointer)

let listOfBuyersSlotForBuyer := keccak256(initialFreeMemoryPointer, 64)

let alreadyBought := sload(listOfBuyersSlotForBuyer)

  1. I save the address to memory: mstore(freeMemoryPointer, buyerAddress)

  2. I save the slot number mstore(freeMemoryPointer, STORAGE_LIST_OF_BUYERS_SLOT)

  3. I calculate the 32-byte keccak256() hash by passing the 64 bytes of memory to the keccak256() function, and then I read that storage slot with sload().
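The same lookup can be tried out in a small standalone contract. Note that this sketch swaps in a different memory technique: instead of the free memory pointer, it writes the 64 bytes into the scratch space at 0x00 to 0x3f, which Solidity reserves for short-lived use such as hashing. The contract, the function names and the mapping sitting in slot 0 are assumptions made for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract MappingSlotSketch {
    // Hypothetical layout: the mapping is the first variable, so it sits in slot 0.
    mapping(address => bool) public listOfBuyers;

    uint8 private constant STORAGE_LIST_OF_BUYERS_SLOT = 0;

    function markAsBuyer(address buyer) external {
        listOfBuyers[buyer] = true;
    }

    function hasBought(address buyer) external view returns (bool bought) {
        assembly {
            // Key in the first 32-byte word, mapping slot number in the second one.
            mstore(0x00, buyer)
            mstore(0x20, STORAGE_LIST_OF_BUYERS_SLOT)
            // Hash the 64 bytes to get the storage slot of listOfBuyers[buyer].
            bought := sload(keccak256(0x00, 0x40))
        }
    }
}
```

Comparing hasBought(buyer) with the compiler-generated getter listOfBuyers(buyer) for the same address is an easy way to check that the slot calculation is right.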

Conclusion

The whole exercise was really useful to me to learn Yul/Assembly and how EVM works. I understand that I have a lot more to learn. I will keep on studying and working on this.

What Do I Do?

I am a happy software engineer, working for Talent Protocol, but also for personal projects like a Web-2-SMS platform. I also like to play tennis and the piano. I read a lot of books. I have a lovely family of humans and animals.
