Belisarius and the Horde Chapter 2: The Contract and the Calldata

Trying to Disambiguate Belisarius’s Obscure Functions

For an introduction to this series, see here. For quick reference, Belisarius is the nickname I rather arbitrarily gave to the address 0xa57bd00134b2850b2a1c55860c9e9ea100fdd6cf on Ethereum mainnet; you can find the origin story for the nickname in the article in front of you.

Some links that might be useful: Belisarius decompiled, Etherscan

Art

This article’s art was created by MewnCat, based on the concept of bytecode engineering. It’s fittingly titled Bytecode Engineer. The original is animated, but Mirror currently only allows setting a JPG or PNG as the cover art. I’ve uploaded the original to Arweave here:

I’ve also embedded it below. I’m not able to get looping working right now; I might update if I do. (Thanks to Mirror for the help getting this embedded. If anyone else is experiencing issues embedding mp4 on Mirror, try adding ?display=iframe to the url. If there are already parameters at the end of the url following the ?, you can put display=iframe after the ?, and add a & afterwards.)

As with the other articles, this article can be minted as an NFT.

Preface

In the last chapter we dove into the easier end of decompiling Belisarius. There are two functions we didn’t dive into there, namely the two functions that didn’t have a publicly known selector through 4byte: 0x78e111f6 and 0xa90e8731. These will be a bit more tricky just because we won’t have the easy cheat of finding what contract they’re from. (We were able to find all of the publicly known selectors in DappSys’s contracts in the last chapter.)

I started with 0x78e111f6, which ended up sending me down a massive rabbit hole.

0x78e111f6

Interestingly enough, if you look through the last few thousand transactions Belisarius has made (hey, don’t judge, it at least strikes me as probably being more healthy than doomscrolling Twitter), you’ll only find calls to the execute function (the execute(address,bytes) version) and the occasional call to 0x78e111f6. Emboldened by my success finding execute, I searched for 0x78e111f6 on Github (which was how I originally placed execute(bytes,bytes)), and while I didn’t find anything on Github, a broader search led me to a thread written by 0xcacti that actually analyzes Belisarius’s use of this function - apparently they were using it for JIT sandwiches at the time the thread was written.

Judging from Etherscan, it looks like 0x78e111f6 is only used for manipulating Uniswap v3 positions. It would be interesting to analyze historically if that’s all this function has ever been used for, but we haven’t done that yet.

I wasn’t 100% sure where to start, so I ended up trying a few different things. I kept on looking at the decompilation, and was somewhat confident that the first argument was address, and was willing to guess that the next was bytes just by eyeballing the validation. That, and the fact that the body of the function seemed based around a delegatecall made me willing to guess that this was some variation of execute(address,bytes), maybe with an additional argument. I poked around GitHub, particularly around DappSys’s contracts, but didn’t find anything.

While I did keep on looking at the decompilation, and eventually was able to more or less decipher it, another idea I pursued was looking to see if maybe there was another contract with this same function that had verified their code on Etherscan or Sourcify. I am not certain how 4byte works, and it stands to reason to me that it could be possible for there to be verified contracts on either platform that have still not had their selectors uploaded to 4byte. In addition, someone once pointed out another contract to me that they said was “related” to Belisarius. This contract bore numerous similarities to Belisarius, including a 0x78e111f6 function, which made me theorize that there were probably more. I figured if I could find something this way it would be easier and more accurate than pulling through the decompilation, so I started figuring out how to find this.

Discovering the Horde

By now we already had an archive node set up with Trueblocks and evm-trace, something we’ll dedicate the next chapter to, so my first thought was to use them, but after some pings on some Discords, the best advice I got was actually to use Google’s Ethereum BigQuery set for this.

It has been ages since I wrote SQL, which is the language used for querying, but on the bright side, the dataset provides a set of function selectors, which meant I could query it directly to discover every 0x78e111f6 contract. After an amazingly frustrating hour or two I managed to put this together:

SELECT address FROM `bigquery-public-data.crypto_ethereum.contracts` WHERE CONTAINS_SUBSTR(function_sighashes, '78e111f6') AND address != '0xa57bd00134b2850b2a1c55860c9e9ea100fdd6cf' LIMIT 1000

I’ll put this here just in case it helps anyone else trying to familiarize themselves with querying the dataset, and simultaneously in case someone who actually knows what they’re doing can point out a better way to do so.

I had been worried about getting a huge wave of results, so I limited the query to returning 1000 values, but I needn’t have been worried, as it only yielded three addresses:

[{
 "address": "0xfa103c21ea2df71dfb92b0652f8b1d795e51cdef"
}, {
 "address": "0x9799b475dec92bd99bbdd943013325c36157f383"
}, {
 "address": "0x434f179f3f28e2e4476b62562ebe3e2db26b88d7"
}]

The Etherscan links for these three contracts, in order, are: 1, 2, 3. Number 2 is marked "Fund" by Etherscan, for some reason. None of the three verified their source code. A casual glance shows that two of the three (1 and 2) are prolific (>100k txs), and that they call a function execute quite a bit, which makes one suspect that they are built from very similar code to Belisarius, and will likely be useful to know about.

After thinking about this for a bit more, something struck me as fishy here. I mentioned before that someone had pointed out another contract similar to Belisarius, and that I had seen that it also had a 0x78e111f6 function, yet it wasn’t on the list. I went back to BigQuery and saw that I could search all contract bytecode on mainnet. I was worried both that the selector could coincidentally appear in bytecode for unrelated reasons, and that searching the entirety of the contract bytecode would eat up roughly half of the standard BigQuery free trial, but I figured I’d yolo it. It’s pretty much the same query, but I figure I’ll once again put it here to help those newer than me, and to potentially get help from those better versed:

SELECT address FROM `bigquery-public-data.crypto_ethereum.contracts` WHERE CONTAINS_SUBSTR(bytecode, '78e111f6') AND address != '0xa57bd00134b2850b2a1c55860c9e9ea100fdd6cf' LIMIT 10000

This gave me an additional 8 addresses (and the three above, I’ve filtered those out by hand):

[{
 "address": "0xb00be4839ec2e8ce36959fd41fe37d73d11bebfb"
}, {
 "address": "0xb00b67377f795b4b4b845a6294ba89477e13ace8"
}, {
 "address": "0x5050e08626c499411b5d0e0b5af0e83d3fd82edf"
}, {
 "address": "0xc5ecee6d94f72b6c456e3032f383284cca36b70a"
}, {
 "address": "0xf2b245f40ed1e8fe30a0d7679535a30d52c5b333"
}, {
 "address": "0x4cb18386e5d1f34dc6eea834bf3534a970a3f8e7"
}, {
 "address": "0xa69babef1ca67a37ffaf7a485dfff3382056e78c"
}, {
 "address": "0xa69babe15c6f4fd07270173ea1d7a19b8c3f8c0b"
}]

These results do include the contract I’d been tipped off about. This does raise the question why they weren’t in the first dataset. My suspicion is something like this: I’d assume that the dataset parses function selectors both by using public databases like 4byte and also by parsing calldata to contracts (since the first four bytes are generally selectors). This means that in the event that a contract has functions that don’t appear in the public databases (like 0x78e111f6 as of this writing) that perhaps Google doesn’t “see” the function unless it’s been called. That would raise a suspicion that maybe 0x78e111f6 has only been transacted with in the top 3 (and Belisarius), but not in the other eight. (Or, more pedantically, hasn't been transacted with directly, since I'd guess they're scraping functions from the first four bytes of calldata. The difference would be that they could still theoretically have been interacted with, just only indirectly, like if they were only called by a multisig, where the transaction would be from the EOA to the multisig, or otherwise in the middle of another transaction, through multicall or weiroll, etc.) I haven’t tried validating that yet, though.
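On the earlier worry that the selector could show up in bytecode by coincidence: one way to narrow down a raw substring match is to check whether the four bytes appear as the operand of a PUSH4 instruction, which is how compiler-generated dispatchers typically load a selector before comparing it. The following is only a minimal sketch in Python. The function name is made up for illustration, and it assumes the bytecode can be walked linearly from the first byte (trailing metadata or embedded data can throw such a walk off), so treat a hit as a hint rather than proof.

```python
def has_push4_selector(bytecode_hex: str, selector_hex: str) -> bool:
    """Return True if the selector is the immediate operand of some PUSH4."""
    strip = lambda h: h[2:] if h.startswith("0x") else h
    code = bytes.fromhex(strip(bytecode_hex))
    selector = bytes.fromhex(strip(selector_hex))
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry (op - 0x5f) immediate bytes
            size = op - 0x5F
            if op == 0x63 and code[i + 1 : i + 1 + size] == selector:
                return True
            i += 1 + size  # skip the immediate so it is not read as opcodes
        else:
            i += 1
    return False


# PUSH4 0x78e111f6 followed by EQ: looks like a dispatcher comparison
print(has_push4_selector("0x6378e111f614", "0x78e111f6"))

# The same four bytes buried inside PUSH32 data: a substring match, but not a PUSH4
buried = "7f" + "00" * 14 + "78e111f6" + "00" * 14
print(has_push4_selector(buried, "0x78e111f6"))
```

A filter like this would run over the bytecode column returned by the query, after the fact, rather than inside the SQL itself.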

There’s a bunch to look at with these addresses, but the most urgent order of business was seeing if any of them verified their source code. They didn’t.

This discovery, however, did open up a whole new line of research. There are some clear connections between a number of these contracts. One low-hanging point of interest is that many of them (including Belisarius and contracts 1 and 2 above) share a deployer: 0x8641df2d7c730a8a24db86693fc39f7a74dd4e9d. It was around now that I decided I should start a naming convention for addresses and entities of interest. Bored of Alice, Bob, and friends, I decided to use a similar alphabetical naming scheme, but somehow taking warrior people from the annals of history sounded better, so henceforth 0x86 became known as Artemisia. I realized that if I was naming the ubiquitous deployer that I should also be giving our own focal point, 0xa57, a name, and Belisarius was the standout for ‘b’, which is the rather anticlimactic story of why I’ve been calling our contract Belisarius this whole time. (Though, as a side point, if you don’t know who Belisarius is, I highly recommend the chapter about him in Victor Davis Hanson’s The Savior Generals, and while you’re at it, give Henry Wadsworth Longfellow’s poem a read.) Seeing as Artemisia had deployed a number of these contracts, I also checked Trueblocks to see what other contracts Artemisia deployed; I believe the total was in the thirties. I’ve loosely begun referring to both the 0x78e111f6-bearing contracts and all the other contracts Artemisia deployed as The Horde, and it is my assumption that they will all be useful in research down the line, but that will have to be a rabbit hole for another day.

Trying to research these addresses caused me to learn something interesting: as a part of researching an address, just drop the address in a regular search engine. You may find research that mentions it. Just as an example, the contract Etherscan labeled “Fund” (contract 2 above) has a bit of an online presence:

On the other hand, using these addresses for researching 0x78e111f6 had officially hit a dead end.

Calldata Analysis

I thought it might be easier to at least figure out the arguments of the function by taking a look at the calldata 0x78e111f6 was called with. I started with 0xb52668345b575b2baedd2801d13b6bac25fc594ec7e8ed1776f47d1200e3ebb9.

Here’s what the raw calldata from that tx looks like:

0x78e111f6000000000000000000000000b6fb3a8181bf42a89e61420604033439d11a09530000000000000000000000000000000000000000000000000000000000000040000000000000000000000000000000000000000000000000000000000000018433ef3e6a0000000000000000000000000000000000000000000000000000000062fb6ed900000000000000000000000000000000000000000000000000fe091352dc80f0000000000000000000000000000000000000000000000000000000000000007f00000000000000000000000056178a0d5f301baf6cf3e1cd53d9863437345bf9000000000000000000000000c02aaa39b223fe8d0a0e5c4f27ead9083c756cc2000000000000000000000000a0b86991c6218b36c1d19d4a2e9eb0ce3606eb4800000000000000000000000000000000000000001857a47ca29a46000000000000000000000000000000000000000000000000001d14a0219e548200000000000000000000000000000000000000000000000000000000000000000070ccc85b0000000000000000000000000000000000000000000000000000000070ccc85b0000000000000000000000000000000000000000000000000de0b6b3a764000000000000000000000000000000000000000000000000009d97172d0297c0000000000000000000000000000000000000000000000000000000000000

Normally I like explaining different ideas that I tried that didn’t work, but I’m going to need to break tradition here. It took me a lot of tries to figure out the best way to represent this data. I’m not sure I got all of it correctly as it stands, but I’ll walk through how I got at least as far as I did.

The first step is that the first 4 bytes (the first 8 characters after the 0x, since each byte is written as two hex characters) are the function selector of the function being called. After that, we’ll separate the next three words (32 bytes, or 64 characters, each) onto their own lines.

0x78e111f6
    000000000000000000000000b6fb3a8181bf42a89e61420604033439d11a0953
    0000000000000000000000000000000000000000000000000000000000000040
    0000000000000000000000000000000000000000000000000000000000000184
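The same split can be done mechanically. This is a minimal Python sketch; the helper name split_calldata is made up for illustration, and the sample input is just the head of the calldata above, rebuilt with explicit zero-padding so the word boundaries are easy to see.

```python
def split_calldata(calldata_hex: str):
    """Split hex calldata into its 4-byte selector and a list of 32-byte words."""
    data = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    selector, body = data[:8], data[8:]  # 4 bytes = 8 hex characters
    # 32 bytes = 64 hex characters per word; a short final chunk is kept as-is
    words = [body[i : i + 64] for i in range(0, len(body), 64)]
    return "0x" + selector, words


# Selector plus the first three words of the transaction's calldata
head = (
    "0x78e111f6"
    + "b6fb3a8181bf42a89e61420604033439d11a0953".rjust(64, "0")
    + "40".rjust(64, "0")
    + "184".rjust(64, "0")
)

selector, words = split_calldata(head)
print(selector)
for w in words:
    print("    " + w)

# The second and third words read as plain numbers
print(int(words[1], 16), int(words[2], 16))  # 64 388
```

Passing the full raw calldata instead of head gives the whole list of words, including whatever short chunk is left over at the end.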

Looking at the delegatecalls made in the transaction, we can see that the first word is the address of the strategy contract being called by Belisarius. This is more or less confirmation that the 0x78e111f6’s first argument is address. The others are simple hexadecimal values, hard to tell what they are quite yet - though the first, 0x40, is 64 when translated into decimal, which in the world of powers of 2 (and 32-byte words), is a suspiciously convenient number, which I think makes it fair to have a suspicion that it’s a number (as opposed to something like a string or other datatype).

While splitting into 32-byte lines is a simple tactic, it’s surprisingly useful. If you’d just eyeball the calldata, it’s easy to miss that lone 4 in a sea of zeroes, let alone see that it’s actually 40 instead of 4. In addition, it would be easy to think that the 184 keeps on going, since it’s followed by nonzero hexes (33ef3e6a) too. That itself is a bit interesting, though - as you can see in the first lines, usually arguments are left-padded, meaning that for example, the address gets enough zeroes added to the beginning of the line to pad the address to 32 bytes. If that’s the case, how come the fourth line starts with 33ef3e6a and then has zeros after it (right-padded)? On the other hand, looking at 33ef3e6a, it looks like a function selector itself, and looking at the strategy contract’s decompilation, we can see that it has a function with that selector. If we make a leap of faith for a second and say that the calldata starting 33ef3e6a starts a second call nested in the first, a lot of the calldata falls into place. First, let’s break it up:

[1]  0x78e111f6
[2]      000000000000000000000000b6fb3a8181bf42a89e61420604033439d11a0953
[3]      0000000000000000000000000000000000000000000000000000000000000040
[4]      0000000000000000000000000000000000000000000000000000000000000184
[5]      33ef3e6a
[6]          0000000000000000000000000000000000000000000000000000000062fb6ed9
[7]          00000000000000000000000000000000000000000000000000fe091352dc80f0
[8]          000000000000000000000000000000000000000000000000000000000000007f
[9]          00000000000000000000000056178a0d5f301baf6cf3e1cd53d9863437345bf9
[10]         000000000000000000000000c02aaa39b223fe8d0a0e5c4f27ead9083c756cc2
[11]         000000000000000000000000a0b86991c6218b36c1d19d4a2e9eb0ce3606eb48
[12]         00000000000000000000000000000000000000001857a47ca29a460000000000
[13]         00000000000000000000000000000000000000001d14a0219e54820000000000
[14]         0000000000000000000000000000000000000000000000000000000070ccc85b
[15]         0000000000000000000000000000000000000000000000000000000070ccc85b
[16]         0000000000000000000000000000000000000000000000000de0b6b3a7640000
[17]         00000000000000000000000000000000000000000000009d97172d0297c00000
[18]         00000000000000000000000000000000000000000000000000000000

(You may need to copy-paste that into a text editor, depending on how usefully it’s formatted by Mirror.)

There is something that may look a little bit weird here, namely that the calldata does not end neatly at the end of its line. I plan on getting back to that later. By breaking the calldata up like this there are some things we can get even from a first glance - lines 9-11 look like addresses, and it’s very reasonable to assume that they are, being that the first is the address that received the funds from the transaction (and seems to be a common recipient of Belisarius’s transactions in general, making it likely a member of The Horde), followed by the addresses of the two tokens comprising the Uniswap pool being interacted with: WETH and USDC. Line 6 and the repeated value in lines 14-15 are the right size to be a function selector, but I did not find a matching selector in the databases nor in the strategy contract being delegatecalled by Belisarius in the transaction.

This does leave the question of why and how there’s a call nested in the call, but if you think about it for a minute, the truth is that this closely resembles what we know of execute(address,bytes). If we’re willing to continue with that thought process, we would expect the payload used to call the strategy contract to be a bytes array (bytes in Solidity). The payload starts with the function selector, in this case 33ef3e6a, so that leaves us to figure out what lines 3 (0x40) and 4 (0x184) are.

I’ll skip ahead to the answer here, which is something you likely have already figured out if you’re familiar with array encoding. Starting at the beginning, bytes is a kind of array under the hood. It stores an arbitrary amount of, well, bytes. How does the EVM know what is a member of an array, and what is a value outside of it? For example, let’s say you’re making a call to a function that takes (bytes32[], bytes32) as arguments, how would the EVM know if a particular 32-byte word is in the array or the argument afterwards? The answer is that there are two additional pieces of information that get encoded with an array: the offset and the number of elements (for bytes, the length in bytes). I’m not sure I understood the spec entirely, so I’ll explain as best as I can, and please dyor to see if I’m right or not. The offset simply tells us how far into the encoded arguments (counting from just after the 4-byte selector) to look to find the number of elements, which will then be followed by the actual data. (The array in the example, bytes32[], has a dynamic length as a type, but any particular call carries a fixed number of elements, and that number is what the length word records. If you’re really interested in knowing how dynamic-length arrays are encoded, that’s here in the spec.)

Looking at our calldata, line 3’s value is 0x40 in hex, or 64 in decimal. If this is an offset, it would mean the number of elements should start 64 bytes into the arguments (not counting the 4-byte selector on line 1), or in other words, at the beginning of line 4. Line 4’s value is 0x184, or 388 in decimal. If you count that out (starting from the beginning of line 5), that would take us precisely to the end of line 17: 4 bytes for line 5 plus 12 words of 32 bytes each. That would seem convenient enough to basically serve as confirmation that we’re correct in our assumption and the way we broke things up above.

It also means that at least as far as we can tell, 0x78e111f6’s args are (address,bytes), which would mean that it is not named execute, since we’ve already got an execute(address,bytes), and the same name with the same argument types would give the same selector. I did try a number of variants using Cast, like execute0, execute_, and x, but to no avail.
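The offset-and-length reading can be checked mechanically. Below is a minimal Python sketch that rebuilds the calldata from the values in the breakdown (lines 5-17) and decodes it as selector + (address, bytes). The helper names word and decode_address_bytes are made up for illustration, and the decoder only handles this one argument shape; it is not a general ABI decoder.

```python
def word(hex_value: str) -> bytes:
    """Left-pad a hex value to one 32-byte word."""
    return bytes.fromhex(hex_value.rjust(64, "0"))


def decode_address_bytes(calldata: bytes):
    """Decode calldata shaped like selector + (address, bytes)."""
    selector, args = calldata[:4], calldata[4:]
    target = "0x" + args[12:32].hex()  # address is the low 20 bytes of word 0
    offset = int.from_bytes(args[32:64], "big")  # counted from the start of args
    length = int.from_bytes(args[offset : offset + 32], "big")
    payload = args[offset + 32 : offset + 32 + length]
    return selector, target, offset, length, payload


# The nested call: inner selector followed by its twelve argument words
inner = bytes.fromhex("33ef3e6a") + b"".join(word(v) for v in [
    "62fb6ed9",
    "fe091352dc80f0",
    "7f",
    "56178a0d5f301baf6cf3e1cd53d9863437345bf9",
    "c02aaa39b223fe8d0a0e5c4f27ead9083c756cc2",  # WETH
    "a0b86991c6218b36c1d19d4a2e9eb0ce3606eb48",  # USDC
    "1857a47ca29a460000000000",
    "1d14a0219e54820000000000",
    "70ccc85b",
    "70ccc85b",
    "de0b6b3a7640000",
    "9d97172d0297c00000",
])

calldata = (
    bytes.fromhex("78e111f6")
    + word("b6fb3a8181bf42a89e61420604033439d11a0953")  # strategy contract
    + word("40")                                        # offset of the bytes argument
    + word(format(len(inner), "x"))                     # its length in bytes
    + inner
    + bytes(-len(inner) % 32)                           # zero-fill to a whole word
)

selector, target, offset, length, payload = decode_address_bytes(calldata)
print(selector.hex(), target)
print(offset, length)     # 64 388
print(payload[:4].hex())  # 33ef3e6a
```

If the decoded length had not landed exactly on the end of the last full word of the nested call, that would have been a sign that the (address, bytes) guess was wrong.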

Having a stronger idea of what the arguments were, I felt it was time to look at the decompilation more closely.

Before we do that, let’s try to clear up one last potential mystery. Doesn’t it seem strange that the calldata on line 18 doesn’t have 64 characters? (In case you don’t feel like counting, it’s 56 zeros, or 28 bytes.) Isn’t the encoding always in 32-byte words? I fell down a bit of a rabbit hole and discovered that some MEV bots apparently add some non-ABI-encoded data in their calls, as it can still be accessed through msg.data, but there’s an even simpler answer. If you look at the way we broke things up before, both line 1 and 5 have only 4 bytes. Line 1 is the function selector for the call, so it really is an exception to the 32-byte word rule, but line 5, even though it’ll be used as a function selector in another call, is just a part of a bytes array. The bytes array itself is not cleanly divisible by 32 bytes, so it gets some additional padding. If you look at the calldata as a whole (well, excluding line 1), the entirety is indeed divisible by 32.
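The padding arithmetic is small enough to check by hand, but here it is as a short Python sketch (the function name padded_length is made up for illustration):

```python
def padded_length(n: int) -> int:
    """Round a byte length up to the next multiple of 32."""
    return (n + 31) // 32 * 32


payload_len = 0x184  # 388 bytes: the 4-byte inner selector plus 12 words
padding = padded_length(payload_len) - payload_len

# Three head words (address, offset, length), then the padded bytes data
args_len = 32 * 3 + padded_length(payload_len)

print(padding)        # 28
print(args_len % 32)  # 0
```

The 28 bytes of padding are exactly the 56 zeros on line 18.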

Back to the Bytecode

Let’s go through EtherVM’s decompilation of 0x78e111f6 more thoroughly. Just looking through the code, we can see the decompilation of the function is divided into three parts: the initial dispatch, func_0302, and func_088F.

the decompilation continues after this, but the rest isn't relevant for us afaict

The initial dispatch first gets the size of the calldata minus the selector by taking the length of the calldata and subtracting 4 from it (for the selector). If the result is less than 64 bytes, the call reverts. Now that we know that there are two arguments to the function, this makes sense - 64 bytes is the minimum size for two arguments (a 32-byte word for each). It then jumps to what it calls func_0302.

func_0302

If you look at the image above, you’ll see that arg0 at the time this function is called is 4, and arg1 is the size of the calldata minus 4 (so the length of the data without the selector). If I’m reading correctly, the first if should be checking the line with the offset for the bytes array. I assume that the check is based on some part of the ABI spec (probably the max size for calldata), but am too lazy to find out. If that is the case, the next if makes sure that the offset is somewhere inside the calldata; if the offset is a higher number than the length of the calldata, something is wrong, and it just reverts.

Assuming we’re still reading right, temp3 should be the word the offset points to, which should be the number of bytes in the bytes argument. I’m assuming that 0x100000000 is still the upper bound on calldata size, and if the number of bytes the calldata says is in the array is bigger than that, the validation fails and it reverts. temp5 would then be the same thing, which is then used to pull the data of the bytes array into memory. Then func_088F is called with the address as the first arg. The second arg, as far as I can tell, is actually just 32 bytes of zeroes, since when that variable is set by reading from memory there shouldn’t be any memory stored there, but I’m not sure about that.

(One thing that might be becoming clear is that either I’m not very good at reading decompilation, or that the compiler seems to do a number of either redundant or otherwise weird operations. I’m feeling like I might have finally nerded out enough to appreciate this famous classic a bit more, though nowhere near that level of proficiency.)

Looking back, that would make this block largely data type validation, and some preparatory memory operations. I’ve started noticing that this seems to be a general rule with EtherVM; first, in the dispatch, all it does is check that there’s enough data to support the number of arguments. Then it jumps, and that jump is to some basic data type validation and memory prep, then it jumps again and that’s where the main logic is. I don’t think that this is an ironclad rule, but you might find it useful in your own journeys.
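To make that validation pattern concrete, here is a rough Python paraphrase of the checks as described above: the size check in the dispatch, then the offset and length checks before the bytes are copied. This is a sketch of my reading, not a translation of the decompiled code. The names are made up, the Revert exception merely stands in for an EVM revert, and applying the 0x100000000 bound to both the offset and the length is an assumption.

```python
class Revert(Exception):
    """Stands in for an EVM revert."""


MAX_VALUE = 0x100000000  # bound mentioned in the text; using it for both checks is an assumption


def load_bytes_arg(calldata: bytes) -> bytes:
    """Validate selector + (address, bytes) calldata and return the bytes payload."""
    args = calldata[4:]  # drop the selector
    if len(args) < 64:   # dispatch check: room for two 32-byte head words
        raise Revert("calldata too short for two arguments")

    offset = int.from_bytes(args[32:64], "big")
    if offset > MAX_VALUE or offset + 32 > len(args):
        raise Revert("offset points outside the calldata")

    length = int.from_bytes(args[offset : offset + 32], "big")
    if length > MAX_VALUE or offset + 32 + length > len(args):
        raise Revert("length runs past the end of the calldata")

    # This slice is what would be copied into memory for the delegatecall
    return args[offset + 32 : offset + 32 + length]


good = (
    bytes.fromhex("78e111f6")
    + (0).to_bytes(32, "big")   # placeholder address word
    + (64).to_bytes(32, "big")  # offset
    + (4).to_bytes(32, "big")   # length
    + bytes.fromhex("33ef3e6a") + bytes(28)
)
print(load_bytes_arg(good).hex())  # 33ef3e6a
```

Calldata that is too short, or whose offset or length points past the end of the data, raises Revert instead of returning a payload.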

I can't fit the whole delegatecall in the screenshot

The first if in func_088F checks if msg.sender is a ward or not, and reverts if not. After that we hop into a delegatecall. Both call and delegatecall return two variables - a bool indicating if the call was successful or not, and the data returned by the call. A variable is set to store the bool (so the function can revert if the delegatecall wasn’t successful), but the code doesn’t bother committing the data to memory in this line - it’ll use returndata for that.

Since the delegatecall didn’t fit in the screenshot, here it is:

temp1, memory[0x00:0x00] = address(arg0).delegatecall.gas(msg.gas)(memory[temp0 + 0x20:temp0 + 0x20 + memory[temp0:temp0 + 0x20]]);

That should be more or less what we expect. It delegatecalls the address that was passed in, and supplies the bytes arg from memory. The next chunk pulls the returndata into memory, and returns it if the call failed. If the call failed, the returned data is the error message, so this allows the function to return an error message from a failed delegatecall.

(It does seem like there are still some more unsolved mysteries here, since func_088F returns something, but I haven’t quite figured that out, and it doesn’t seem terribly relevant either.)

At this point, we seem to have a fairly good idea of what 0x78e111f6 does - it takes an address and a payload, and delegatecalls the address with the payload. The thing is, that’s exactly what execute(address,bytes) does, no? Why does the contract need both? I went back and started comparing the decompilation of the two functions, and realized that they are almost identical. There’s only one difference (as far as I can tell), namely that 0x78e111f6 bubbles up the error message on a failed delegatecall whereas execute(address,bytes) does not.

I’m fairly certain that this is the difference between them. Out of curiosity, I looked at the code in ds-proxy again, and while the current implementation of execute(address,bytes) does bubble up the errors, once upon a time it did not. So Belisarius’s execute(address,bytes) is actually taken from the older version of ds-proxy, while 0x78e111f6 is the newer. I tried some selector brute forces from that (executeNew, executeWithErr, executeErr, etc), but still to no avail.

Interestingly, like we originally pointed out by way of 0xcacti’s thread, Belisarius uses 0x78e111f6 as the second leg of Uniswap v3 JIT attacks. This implies something; I would suspect the most obvious guess is that there is some special need to be able to see error messages there. I’m not an expert with Uniswap v3 at all, so I don’t know why that would be. One other idea that occurred to me was that it simply might make detecting the attacks harder if they come from two separate functions.

0xa90e8731

That brings us to the last of Belisarius’s functions for us to unravel.

Here I started with the decompilation. I’ve been getting comfortable enough with it that I figured I’d get what details I could out of it before looking for code in DappSys and casting wider nets in Github.

The validation in the initial dispatch looks for 64 bytes of data, so it seems that the function takes two arguments. It then sends off to func_04A7. func_04A7 is quite large, so one of the first things I did was try to find which function it in turn sent off to, since most functions I’ve seen in EtherVM break into three parts, as mentioned above. Interestingly enough, the bottom of the function doesn’t send to another function - it’s a bunch of nested if/else conditions and jump destinations, which at least instinctively looks to me something like a for-loop likely would. Scrolling up, the only line I could find that would jump outside func_04A7 was:

return func_088F(var3, var4);

You may remember func_088F from the previous section - it’s the main logic of 0x78e111f6. This piqued my curiosity. In ds-proxy, execute(bytes,bytes) makes a call to execute(address,bytes) at the end of it. If 0x78e111f6 is the newer version of execute(address,bytes) that includes error bubbling, a good initial hypothesis would be that 0xa90e8731 is a version of execute(bytes,bytes) that calls it.
Looking at the code of execute(bytes,bytes), we see that it calls two other functions, read and write:

target = cache.read(_code);
if (target == address(0)) {
    // deploy contract & store its address in cache
    target = cache.write(_code);
}

We should be able to find these selectors (read: 0x8bf4515c, write: 0x7ed0c3b2) in the code of 0xa90e8731 if our hypothesis is correct. Sure enough, they’re both there.

Another quick check would be checking the decompilation of execute(bytes,bytes) against func_04A7. Just eyeballing the two is enough to see that they aren’t the exact same, but are indeed very similar. One particular point of interest is:

func_0603(var2, var3); 
return;

func_0603 is the main logic for execute(address,bytes), so this is the analog to where 0xa90e8731 sends off to func_088F, if our theory is correct. The big difference is that func_04A7 returned the result of func_088F, implying a return value, whereas execute is not returning a value from func_0603, but rather returning the next line. For me, this makes sense, since func_088F can return an error message, but func_0603 will not.

This is all enough for me to conclude that 0xa90e8731 is to 0x78e111f6 as execute(bytes,bytes) is to execute(address,bytes) in Belisarius, at least until proven otherwise.

Well Why Not

The final cherry on top would have been getting the function names for 0x78e111f6 and 0xa90e8731. In the name of Research, I decided I’d reach out to Artemisia and Belisarius on-chain, and see if maybe they’d be willing to fill me in. For those who haven’t heard of it, you can essentially send a message to an address on-chain by sending an otherwise empty transaction to the address and encoding your message in UTF-8 in the data field. (I’m planning on making a different message to Belisarius the topic of an Interlude.)
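For the curious, the encoding itself is nothing more than the UTF-8 bytes of the message written as hex. A minimal Python sketch (the function names and the sample message are made up for illustration):

```python
def message_to_input_data(message: str) -> str:
    """Encode a text message as hex for a transaction's data field."""
    return "0x" + message.encode("utf-8").hex()


def input_data_to_message(data_hex: str) -> str:
    """Decode a hex data field back into text, replacing undecodable bytes."""
    raw = data_hex[2:] if data_hex.startswith("0x") else data_hex
    return bytes.fromhex(raw).decode("utf-8", errors="replace")


data = message_to_input_data("gm, Belisarius")
print(data)
print(input_data_to_message(data))
```

Decoding in the other direction is how such a message gets displayed as text when viewing a transaction's input data.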

I used Etherscan’s IDM tool to send the messages. If you’re interested, these are the transactions: Artemisia, Belisarius. If you’d like to follow along in IDM, you can use these links: Artemisia, Belisarius. Even if they send a return message from a different address, it should still show up there.

On a side note, even though contracts don’t have built-in capabilities to make an arbitrary call, which should prohibit messaging if they don’t have a specific function for it, Belisarius should be able to respond. While I suspect that calling execute(address,bytes) with my address and the message UTF-8 encoded as the second argument would revert - I assume it would try delegatecalling into my address, which is an EOA, and fail, there is another way. An operator (or anyone, really) could deploy a contract with a function something like this:

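function chat(address target, bytes calldata message) external {
  // Send the message as the calldata of a plain call to the target
  (bool success, bytes memory data) = target.call(message);
  // Revert, passing along any returned data, if the call failed
  require(success, string(data));
}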

Then, Belisarius could execute(address,bytes) into that contract with the call to chat encoded as the second argument. That would delegatecall the chat function, making it look like Belisarius sent the message. That all being said, I wouldn’t expect Belisarius to respond directly. In fact, I’m none too certain that I’ll receive a response at all. But if I do, I expect that it’ll be Artemisia or one of the other operators.

Conclusion

It seems fair to conclude that we have a fair idea of everything that's happening in the Belisarius contract at this point. Something that is of interest is that this seems more or less unnecessary for understanding Belisarius's activities. All you need to know is that Belisarius deploys other contracts with more granular logic and uses those to execute transactions, but the actual details seem largely irrelevant. As a complete dilettante who's just interested in a good time (with my own bizarre definitions of what that means), I still consider it time well spent.

Something that we do gain from having a better understanding of the contract is a better idea of what can be researched. Understanding the `rely`, `deny`, and `wards` functions means a better understanding of where we can look to understand operators of the contract over time. Understanding the various `execute` functions means we can understand that there's a whole realm of research in discovering these strategy contracts and understanding them.

As mentioned before, this roughly completes an analysis of all of the functions on Belisarius. I’ve put together a Gist with an approximation of the code.

We also learnt that there is a whole cadre of Belisarius-like contracts out there.

The next steps will involve needing to be able to scrape data from the chain. For us, that meant setting up an archive node in the cloud. I haven't seen a good resource on that, so decided we'd take a detour from analysis to detail getting that set up, and then we'll get into the data we found, and what it might mean.

I say next steps, but we’ve been working at this already. We have some initial data, I’ve got a fren working on some visualizations, and we’re trying to figure out how to take this further. I’ve also started on a couple of Interludes, pieces that are more narrative around some stories, real or imagined, that have come from this research.

As always, if you're interested in contributing, or otherwise have comments or observations you'd like to add, we'd love to hear from you, anon.

Subscribe to Will Schwab