When I started writing smart contracts, this article taught me a great deal, because it explained what each line of Uniswap V2’s contract does.
Following the same pattern, this article shares what I have learned about YUL and, in the process, helps the reader learn more about the EVM.
The code we’ll be looking over is this
alert(“SSTORE2 is not a new opcode; it’s a more gas-efficient method of storing information on Ethereum”)
Quoting exactly what 0xsequence’s repo says about SSTORE2:
“SLOAD2 is a set of Solidity libraries for writing and reading contract storage paying a fraction of the cost, it uses contract code as storage, writing data takes the form of contract creations and reading data uses EXTCODECOPY.”
So what this means is that instead of storing data in contract storage, which is the more conventional and costly way, SSTORE2 lets us store data as a contract’s bytecode using the CREATE opcode and read it back through EXTCODECOPY.
SO BRILLIANT!!!!
You can read more about the gas cost savings here; let’s dive straight into the code.
We will be looking over Solady’s implementation of SSTORE2. { Check the code }
Let’s get started with the code.
Line number 36 contains our most valuable function:
function write(bytes memory data)
Let’s dive straight into the assembly block :
let originalDataLength := mload(data)
The input to the function, data, is of type bytes, which as per the Solidity docs is a dynamic type. In memory, a dynamic type is laid out as follows: the first 32 bytes contain the length of the value, and the bytes that follow contain the actual data.
Inside the assembly block, data is the memory pointer at which this layout starts. So mload(data) reads the 32-byte word stored at that offset, which is the length of the bytes value, and assigns it to the variable originalDataLength.
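The layout described above can be modelled in a few lines of Python. This is only a toy sketch, not EVM code: memory is a plain byte array, the helpers mstore and mload are stand-ins written for this example, and the pointer value 0x80 is an assumption chosen for illustration.

```python
# Toy model of EVM memory: a flat byte array addressed by byte offset.
memory = bytearray(256)


def mstore(offset: int, value: int) -> None:
    """Write `value` as a 32-byte big-endian word starting at `offset`."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def mload(offset: int) -> int:
    """Read the 32-byte big-endian word starting at `offset`."""
    return int.from_bytes(memory[offset:offset + 32], "big")


# Assumption for illustration: Solidity placed `data` at offset 0x80.
data = 0x80
payload = b"hello"

# A `bytes` value in memory: one length word, then the raw bytes.
mstore(data, len(payload))
memory[data + 32:data + 32 + len(payload)] = payload

# Equivalent of `let originalDataLength := mload(data)`.
original_data_length = mload(data)
print(original_data_length)  # 5
```

Reading at data gives the length; the payload itself only starts 32 bytes later, at data + 0x20.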
// Add 1 to data size since we are prefixing it with a STOP opcode.
let dataSize := add(originalDataLength, DATA_OFFSET)
The passed-in parameter data is going to act as the runtime bytecode of the new contract. Since this bytecode doesn’t expose any functionality, it makes sense to prefix our data with 00, aka the STOP opcode, so that any call to the contract halts immediately instead of executing our data as code. Therefore we add 1 byte (DATA_OFFSET) to the bytecode size to account for this newly added instruction.
mstore(
data,
or(
0x61000080600a3d393df300,
// Left shift `dataSize` by 64 so that it lines up with the
// 0000 after PUSH2
shl(0x40, dataSize)
)
)
The first parameter of or() is what I call a template initCode, with 2 bytes of space left (the 0000 right after 61, i.e. PUSH2) to hold the size of the data.
Refer to the comments on Lines 46-60; they explain what each instruction does.
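To see the or() and shl() at work, here is a small Python sketch of the same arithmetic. The function name patch_init_code is made up for this example, and the size 6 assumes 5 bytes of data plus the STOP byte.

```python
# The 11-byte template: PUSH2 0000, DUP1, PUSH1 0a, RETURNDATASIZE,
# CODECOPY, RETURNDATASIZE, RETURN, followed by the STOP byte.
TEMPLATE = 0x61000080600A3D393DF300


def patch_init_code(data_size: int) -> bytes:
    """Mirror `or(TEMPLATE, shl(0x40, dataSize))` and return the 11 bytes."""
    if not 0 <= data_size < 2**16:
        raise ValueError("PUSH2 only has room for a 2-byte size")
    # Shifting left by 0x40 = 64 bits = 8 bytes lands the size exactly
    # on the two zero bytes that follow the PUSH2 opcode (0x61).
    word = TEMPLATE | (data_size << 0x40)
    return word.to_bytes(11, "big")


init = patch_init_code(6)  # assumed: 5 bytes of data + 1 STOP byte
print(init.hex())  # 61000680600a3d393df300
```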
But then the question arises: why do we need this at all?
The answer that I found after talking to vectorized.eth was this:
When a new contract is deployed using CREATE, it creates a new context. In this context, it runs the initcode, which is responsible for executing the constructor logic and returns the runtime bytecode at the end. The RETURN opcode is doing exactly that.
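To make the idea of initcode returning the runtime bytecode concrete, here is a minimal Python interpreter for just the six opcodes this template uses. It is a sketch for illustration, not a real EVM: run_init_code is a hypothetical helper, gas is ignored, and RETURNDATASIZE is assumed to push 0 because no call has been made yet in the creation context.

```python
def run_init_code(creation_code: bytes) -> bytes:
    """Run the template's opcodes and return what RETURN hands back."""
    stack: list[int] = []
    memory = bytearray()
    pc = 0
    while True:
        op = creation_code[pc]
        if op == 0x61:  # PUSH2: push the next 2 bytes
            stack.append(int.from_bytes(creation_code[pc + 1:pc + 3], "big"))
            pc += 3
        elif op == 0x60:  # PUSH1: push the next byte
            stack.append(creation_code[pc + 1])
            pc += 2
        elif op == 0x80:  # DUP1: duplicate the top of the stack
            stack.append(stack[-1])
            pc += 1
        elif op == 0x3D:  # RETURNDATASIZE: 0 here, a cheap way to push zero
            stack.append(0)
            pc += 1
        elif op == 0x39:  # CODECOPY(destOffset, offset, size)
            dest, src, size = stack.pop(), stack.pop(), stack.pop()
            if len(memory) < dest + size:
                memory.extend(bytes(dest + size - len(memory)))
            memory[dest:dest + size] = creation_code[src:src + size]
            pc += 1
        elif op == 0xF3:  # RETURN(offset, size): this becomes the runtime code
            offset, size = stack.pop(), stack.pop()
            return bytes(memory[offset:offset + size])
        else:
            raise ValueError(f"opcode {op:#x} is not part of this sketch")


runtime = b"\x00" + b"hello"                      # STOP prefix + data
init = bytes.fromhex("61000680600a3d393df3")      # 10 bytes, size patched to 6
deployed = run_init_code(init + runtime)
print(deployed == runtime)  # True
```

The init code copies everything after its own 10 bytes into memory and returns it, so the deployed contract's code is exactly the STOP byte followed by the data.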
Prior to this mstore() call, we know that the length of the data was stored at memory offset data.
But we already saved it in originalDataLength, so we don’t need it there for now. So what we do is overwrite this length word with the initCode.
Now the memory layout looks something like this: { init code } { “00” + data }
Together, this makes up the creationCode that will be passed on to create().
// Deploy a new contract with the generated creation code.
pointer := create(0, add(data, 0x15), add(dataSize, 0xa))
Next we deploy a new contract with the CREATE opcode. Do pay attention to the arguments that are being passed to it. Let’s take a look at them.
create( value in wei, memory offset in bytes where creationCode resides, size of creationCode )
- 0 wei.
- Why are we adding 0x15, aka 21, to the data offset?
Take a closer look at 61000080600a3d393df300. It’s 11 bytes long !!!! mstore() always writes a full 32-byte word, so a shorter value is left-padded with zeros to 32 bytes before it is written. That is what happens when the mstore() on Line 61 is called, and the 11 bytes end up at the end of the word, behind 21 zero bytes. Memory at offset data looks like this :
00000000000000000000000000000000000000000061YYYY80600a3d393df300
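Our creation code starts after those 21 bytes, i.e. at memory offset data+0x15.
- The third parameter, the size of the creation code, is straightforward:
10 bytes of initCode + dataSize (the 11th byte of the template is the STOP prefix, which dataSize already counts)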
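The two offsets passed to create() can be checked with a short Python model of the memory involved. As before, this is a sketch: the pointer 0x80 and the payload are assumptions picked for the example.

```python
memory = bytearray(256)
data = 0x80          # assumed memory pointer of `data`
payload = b"hello"   # assumed contents of `data`

# `bytes` layout: 32-byte length word, then the payload.
memory[data:data + 32] = len(payload).to_bytes(32, "big")
memory[data + 32:data + 32 + len(payload)] = payload

data_size = len(payload) + 1  # + 1 for the STOP prefix

# mstore(data, or(template, shl(0x40, dataSize))): a full 32-byte write,
# so the 11 meaningful bytes are preceded by 21 zero bytes.
word = 0x61000080600A3D393DF300 | (data_size << 0x40)
memory[data:data + 32] = word.to_bytes(32, "big")

# create(0, add(data, 0x15), add(dataSize, 0xa)) reads this slice.
start = data + 0x15
creation_code = bytes(memory[start:start + data_size + 0x0A])
print(creation_code.hex())
# 61000680600a3d393df30068656c6c6f
```

The slice skips the 21 padding bytes and covers 10 bytes of init code, the STOP byte and the 5 data bytes, 16 bytes in total.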
mstore(data, originalDataLength)
- Here we replace the initCode at offset data with the original length of data, originalDataLength.
The reason (thanks to vectorized.eth) being:
Otherwise, users would need to be careful when using data again after passing it to this function, as its first 32 bytes would no longer represent the length of the dynamic type.
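A quick Python sketch shows what would go wrong without this restore. The helper read_bytes and the pointer 0x80 are assumptions made for this example; it interprets memory the way Solidity interprets a bytes value.

```python
def read_bytes(memory: bytearray, ptr: int) -> bytes:
    """Interpret memory at `ptr` as `bytes`: a length word, then the data."""
    length = int.from_bytes(memory[ptr:ptr + 32], "big")
    return bytes(memory[ptr + 32:ptr + 32 + length])


memory = bytearray(256)
data = 0x80  # assumed memory pointer of `data`
payload = b"hello"
memory[data:data + 32] = len(payload).to_bytes(32, "big")
memory[data + 32:data + 32 + len(payload)] = payload
original_data_length = len(payload)

# The length word gets overwritten with the init code.
init_word = 0x61000080600A3D393DF300 | ((original_data_length + 1) << 0x40)
memory[data:data + 32] = init_word.to_bytes(32, "big")
clobbered_length = int.from_bytes(memory[data:data + 32], "big")

# mstore(data, originalDataLength) puts the real length back.
memory[data:data + 32] = original_data_length.to_bytes(32, "big")
restored = read_bytes(memory, data)
print(restored)  # b'hello'
```

While the init code sits in the length slot, anything that reads data as bytes would see an absurdly large length (clobbered_length above), which is why the original value is written back.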
----------------------------------HALF-WAY DONE-------------------------------------
Next comes the better-half of function write() :
function read(address pointer)
This is the function responsible for retrieving the data that we stored at address pointer.
// If the code size of `pointer` is zero, revert.
let pointerCodesize := extcodesize(pointer)
if iszero(pointerCodesize) {
mstore(0x00, 0x11052bb4)
revert(0x1c, 0x04)
}
What this does is retrieve the runtime bytecode size using extcodesize, store it in the variable pointerCodesize and run an iszero() check to ascertain whether the given address actually contains some code. We can learn from this block how to revert with custom errors in YUL.
0x11052bb4 is the selector of the error InvalidPointer() in Solady. (0x30116425, which Solady’s write() uses when the deployment fails, is the selector of DeploymentFailed().)
We store this 4-byte selector at memory offset 0x00. mstore() left-pads it to 32 bytes, so the selector sits in the last 4 bytes of that word. The offset to retrieve it is therefore 0x00 + 28 = 0x1c, which goes in as the first parameter of revert(); the second parameter, 0x04, is the number of bytes to return.
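The 0x1c offset is easy to verify with a toy model. In this Python sketch the selector 0xDEADBEEF is a placeholder, not a real error selector, and mstore and revert_payload are helpers invented for the example.

```python
memory = bytearray(64)


def mstore(offset: int, value: int) -> None:
    """Write `value` as a 32-byte big-endian word starting at `offset`."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def revert_payload(offset: int, size: int) -> bytes:
    """Return the bytes that revert(offset, size) would hand to the caller."""
    return bytes(memory[offset:offset + size])


SELECTOR = 0xDEADBEEF  # placeholder 4-byte selector for illustration

mstore(0x00, SELECTOR)               # selector ends up in bytes 28..31
payload = revert_payload(0x1C, 0x04)
print(payload.hex())  # deadbeef
```

Because the value is right-aligned in the 32-byte word, the first 28 bytes are zero and the selector starts at byte 28, i.e. 0x1c.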
mstore(data, originalDataLength)
“You know what this means brother - Thor to Loki”
let size := sub(pointerCodesize, DATA_OFFSET)
data := mload(0x40)
Remember we prefixed our data with a STOP opcode, right?
We have got to account for that and subtract it to get the actual size of the data.
Memory offset 0x40 stores the free memory pointer for us to play with. The mload instruction assigns the free memory pointer to data.
mstore(0x40, add(data, and(add(size, 0x3f), 0xffe0)))
This is my favourite YUL line of code so far.
- We will be storing our data at the free memory pointer obtained above.
- We need to update the free memory pointer so that it points to a fresh 32-byte slot after our data has taken all the slots it requires.
- If size % 32 == 0, we can simply update 0x40 to data + 32 + size (the extra 32 is for the length word), but life ain’t ideal.
- What the second parameter does is take into account how many bytes our data leaves unused at the end of the last 32-byte slot it is given.
It is basically data + 32 + size + (32 - size % 32) when size % 32 != 0. The mask 0xffe0 clears the lowest 5 bits, which rounds size + 0x3f down to a multiple of 32.
Check out Line number 88 of Solady’s implementation for more clarity.
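The rounding trick can be checked exhaustively in Python. This sketch compares the masked expression with the plain "length word plus data rounded up to 32" formula; the function names are made up for the example, and the upper bound 24576 is the contract code size limit of 0x6000 bytes.

```python
def allocated_bytes(size: int) -> int:
    """Mirror `and(add(size, 0x3f), 0xffe0)` from the YUL line above."""
    return (size + 0x3F) & 0xFFE0


def expected_bytes(size: int) -> int:
    """32 bytes for the length word + the data rounded up to whole slots."""
    return 32 + ((size + 31) // 32) * 32


# Every possible code size gives the same answer with both formulas.
all_match = all(
    allocated_bytes(size) == expected_bytes(size) for size in range(0, 24577)
)

print(allocated_bytes(5))   # 64: one length slot + one data slot
print(allocated_bytes(32))  # 64: data fills its slot exactly
print(allocated_bytes(33))  # 96: one extra byte needs a whole new slot
print(all_match)            # True
```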
mstore(data, size)
mstore(add(add(data, 0x20), size), 0) // Zeroize the last slot.
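We are returning bytes data, which is a dynamic type, so we need to adhere to its memory layout and store size in the first 32 bytes. The second line zeroes the 32 bytes right after the end of the data, so that the padding in the last slot is clean.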
extcodecopy(pointer, add(data, 0x20), DATA_OFFSET, size)
One trick I have picked up while reading YUL code is to always refer to evm.codes and see what each parameter means. In this case, it is:
extcodecopy(address, where in memory to store, what offset in bytecode to copy,size )
Sure you can figure this one out.
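If you want to check your answer, here is a Python sketch of the whole read path under the same toy memory model. The functions extcodecopy and read are stand-ins written for this example, the contract code is passed in as plain bytes, and the free memory pointer value 0x80 is an assumption.

```python
DATA_OFFSET = 1  # skip the STOP prefix


def extcodecopy(memory: bytearray, code: bytes, dest: int, offset: int, size: int) -> None:
    """Copy `size` bytes of `code` starting at `offset` into memory at `dest`."""
    chunk = code[offset:offset + size]
    chunk += bytes(size - len(chunk))  # reads past the end of code give zeros
    memory[dest:dest + size] = chunk


def read(code: bytes, memory: bytearray, free_ptr: int) -> tuple[int, int]:
    """Return (pointer to the `bytes` value, updated free memory pointer)."""
    size = len(code) - DATA_OFFSET
    data = free_ptr                                      # data := mload(0x40)
    new_free_ptr = data + ((size + 0x3F) & 0xFFE0)       # mstore(0x40, ...)
    memory[data:data + 32] = size.to_bytes(32, "big")    # mstore(data, size)
    memory[data + 32 + size:data + 64 + size] = bytes(32)  # zeroize last slot
    extcodecopy(memory, code, data + 32, DATA_OFFSET, size)
    return data, new_free_ptr


memory = bytearray(512)
code = b"\x00" + b"hello"  # what the storage contract holds
data, new_free_ptr = read(code, memory, 0x80)

length = int.from_bytes(memory[data:data + 32], "big")
print(length, bytes(memory[data + 32:data + 32 + length]))  # 5 b'hello'
```

The copy starts at code offset 1, so the STOP byte never reaches memory, and the result has the usual bytes layout: a length word followed by the data.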
Do look at Solmate’s repo. Its initCode implementation is somewhat different from Solady’s.
Ethereum’s a piece of art. EVM’s a piece of art. (Solady, Solmate) are pieces of art. Hope the msg.reader learnt something from this msg.data. If there’s some error, I do apologize. Please do correct me.
Moreover, it was an amazing learning experience for me, looking at YUL all day and trying to gather meaning out of it. I plan to write more articles like this, now that I like it a lot. I’m always looking to collaborate and work on projects, giving my best efforts in writing and learning smart contracts.
For more, do follow on :
Twitter : @proxima424
Github : proxima424
----------------------------------REFERENCES----------------------------------
Thanks to @vectorized.eth for clearing some of my doubts.
Here are some of the references I used :
Solady’s SSTORE2
Solmate SSTORE2
0xsequence’s SSTORE2
Saw-mon-and-Natalie’s SSTORE2
evm.codes
Also, to get clarity on the difference between runtime bytecode, initcode and creationCode: