An in-depth analysis of ERC-7512: onchain representation for security reviews. v.1

October 4th, 2023

Reflection

This research gathers data and insights from collective work by Octane Security (Nathan Ostrowski) and Rektoff (hyperstructured.greg, Yehor), with turbo-productive feedback from the Trustblock team (Timur, Sacha).

In this analysis, we dive deep into one of the most exciting EIPs in recent memory: EIP-7512, otherwise known as “Onchain Audits”. We’ll start by giving a high-level overview, reflect on its benefits/drawbacks, and discuss areas for improvement. Our goal in this work is to equip the community with the analysis necessary to thoroughly examine and debate the benefits and persistent challenges of this EIP.

Comments and messages are welcome.

Here you can navigate the information by following this structure:

Introduction
- Why do we need this? - 3 fundamental problems.
  - Data Standardization
  - User → Protocol relationship.
  - Data Fragmentation
Reflecting on the challenges of the current EIP:
Auditor Identification
Static Audits, Evolving Security
Contract vs. Protocol-level Audits
Intra-contract Scope
“Been audited” vs. “Passed an audit”
Analyzing the proposed Proxy addition:
- What makes a standard for Proxy/Factory on-chain audit data difficult?
- What do the authors currently propose in their follow-on?
- What’s the difference between deploymentCode and runtimeCode, and why does it matter?
- How can a lack of constructor args get us into trouble?
- Approaches to codifying constructor args
Looking ahead
References

Intro

The idea of an “onchain audits” proposal was cooked up for more than a year in closed chats and Discord servers. While the proposal began with ambitious goals, what we ultimately see in the current EIP is a scaled-back version, designed to maximize the EIP’s chances of being accepted by the wider community.

The perspective should be reversed. Journey should start with user → smart contracts interaction.

Still, crucial security information hidden away in some PDF on some ~~company~~ servers becomes a jungle of misinformation for anyone who wants to verify the project's legitimacy aspect for a protection magnitude.

Daniel, CTO at WorkX.

Why do we need this? - Three fundamental problems.

Standardization

Standards are critical to cybersecurity. At the moment, the Ethereum ecosystem still lacks a standard for security reviews/audits––every auditor uses their own report format, their own definition of severity, and their own checklists and procedures.

While EIP-7512 doesn’t tackle all of these issues, it lays the groundwork for increased standardization within report data as well. It sets out to create a common place for all of this critical data to live on-chain, and sets the stage for future proposals/amendments to align the audit data itself.

In the future, comprehensive data standardization could allow for truly composable security properties. Whether it’s a user interacting with a protocol or a protocol integrating with another, if we implement proper data standardization, we can equip members of the ecosystem with the right tools to confidently make decisions about their own security. With the right standards, there’s real potential to transform the safety of the whole ecosystem.

Increasing trust in User - Protocol interactions

Audit reports are about more than just securing a protocol’s codebase. Frequently, protocols use their audits to increase trust from the whole community. Therefore, the audience of an audit report isn’t just protocol developers, it’s the community at large.

DeFi is also an increasingly composable ecosystem. While a few years ago the average protocol interfaced with just 1 other protocol, today, the average newly created DeFi protocol interacts with 6 others.

These stakeholders––both the community and the DeFi protocols that interface with a given protocol––depend on a reliable and trustworthy place to pull in audit information. Right now, most of this critical security data is spread across github links, auditor websites, protocol websites, and PDF documents. As Sacha at Trustblock points out, it’s paradoxical that a community built on verifiable onchain trust depends so much on sporadically shared web2 links.

This problem has real consequences. Take the example of FEGtoken:

Jack Sanford, the co-founder of Sherlock.xyz, highlights a critical issue in the DeFi space, focusing on an exploit involving FEGtoken. According to Jack, FEGtoken had users interact with a different, unaudited contract than the one they claimed was audited. This led to an exploit costing around $2 million. He points out that both the FEGtoken website and the DeFiLlama website gave the misleading impression that the protocol was audited, despite the fact that users were interacting with unaudited contracts.

EIP-7512 aims to allow protocols and auditors to solve this issue by linking audit details to the smart contract itself, so there's no room for protocols to misrepresent which contracts are audited. It's a way to make sure that you, the user, interact with the audited version you expect.

Data fragmentation problem

This brings us to the central problem that EIP-7512 aims to solve: data fragmentation. Right now, finding audit data is like going on a scavenger hunt. You might find these security reports on a project's website, in its docs, on GitHub, and maybe even in a tweet announcing the audit results. It's scattered, disorganized, and frankly, a pain to piece together.

EIP-7512 tackles this head-on by creating a single, reliable source for all audit data - right on the blockchain.

This single location in a decentralized environment eliminates the chaos of data fragmentation. On-chain audit data makes it easier for users to find what they're looking for and instills a greater sense of trust and reliability in the smart contract.

Reflecting on the challenges of the current EIP

First, let’s talk about the Token Bridge example at the heart of the EIP design. This example strikes at (1) what the proposal aspires to be, and (2) how EIP-7512 could be misused.

In the diagram accompanying the EIP, the authors describe a hypothetical Token Bridge protocol that incorporates onchain audits into its flow. The Bridge (1) grabs the review’s data, (2) verifies the identity of the auditors involved, and (3) checks that the token adheres to an acceptable standard.

This example shows what the authors imagine EIP-7512 could be: a composable piece of protocol logic and decision-making.
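
As a rough sketch of that three-step flow, here is how a bridge might gate a token on review data. Everything named here is an assumption for illustration: the in-memory `registry`, the simplified `AuditSummary` fields, and the helper `bridge_accepts_token` are hypothetical and not taken from the EIP. Auditor identity is also reduced to a name lookup, where a real integration would verify a signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditSummary:
    """Simplified, hypothetical view of an on-chain review record."""
    auditor_name: str       # who issued the review
    contract_address: str   # deployment the review refers to
    erc_standards: tuple    # standards the contract was checked against, e.g. (20,)


def bridge_accepts_token(token_address, registry, trusted_auditors, required_erc):
    """Return True only if all three checks from the Token Bridge example pass."""
    # (1) Grab the review data for this token.
    summary = registry.get(token_address.lower())
    if summary is None:
        return False
    # (2) Verify the identity of the auditor (name lookup stands in for a signature check).
    if summary.auditor_name not in trusted_auditors:
        return False
    # (3) Check that the token adheres to an acceptable standard.
    return required_erc in summary.erc_standards


registry = {"0xabc": AuditSummary("ExampleAuditor", "0xabc", (20,))}
print(bridge_accepts_token("0xABC", registry, {"ExampleAuditor"}, 20))  # True
print(bridge_accepts_token("0xdef", registry, {"ExampleAuditor"}, 20))  # False
```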

However, this also provides a clear example of how it may be misused. We’re concerned that the process described by the authors puts manual verification of audits in the backseat. There is no step in this process where the Bridge Operator explicitly verifies the contents of the audit. Instead, that critical step to check what the audit actually contains is minimized––it’s a disclaimer at the end of the EIP.

And if a protocol were to directly interpret this diagram, they’d likely miss that disclaimer. Thus, they would ignore key questions, like:

How much of this protocol was audited? (i.e. scope)
Were all the suggested changes fixed? (i.e. remediation)
If the protocol left some vulnerabilities as “acknowledged”, is the auditor comfortable with that?

So, if this Token Bridge example conveys that onchain audits are meant to be composable, and if protocols are going to leverage this data to make security decisions, what are the biggest pieces they need to make better-informed decisions?

Here are a few:

Auditor Identification: The current model handles security researchers as monolithic entities with a name, URI, and author list. In practice, multiple specialists often contribute to an audit, sometimes anonymously. Their individual reputations may diverge from the firm's.
1. Our thoughts: Incorporate identity management at both the firm and individual auditor level. Ideally, this would take the form of cryptographic keys, allowing for fine-grained control over who gets to claim credit/responsibility for each report.
2. The current proposal suggests leaving the key management concerns up to auditors and registries, and we agree.
Static Audits, Evolving Security: Security is a constantly evolving landscape––new threat vectors and 0-days pop up over time (see: Read-only Reentrancy). What counts as “secure” in 2018 might not count as “secure” in 2023, when new vulnerabilities exist that weren’t in the scope of the original audit.

Our thoughts: We suggest two ways this problem may be mitigated:
1. Auditors may optionally control whether their data has an expiration date, after which it explicitly shouldn’t be used by protocols.
2. The Trustblock team suggests that registries or auditors may also include dynamic information––such as new 0-days, protocol upgrades, credible bug bounty reports, or other changes. This information could be leveraged in existing composable projects so that those integrating this protocol may “pause” their integrations if necessary.
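
To make these two mitigations concrete, here is a minimal sketch of how an integrating protocol might combine an optional expiration date with dynamic alerts. The `AuditStatus` type, its fields, and `should_pause_integration` are hypothetical names chosen for illustration; none of them appear in the EIP.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuditStatus:
    issued_at: int                     # unix timestamp of the review
    expires_at: Optional[int] = None   # optional expiry chosen by the auditor
    open_alerts: list = field(default_factory=list)  # e.g. new 0-days, upgrades


def should_pause_integration(status: AuditStatus, now: int) -> bool:
    """Pause if the review has expired or if any dynamic alert is outstanding."""
    expired = status.expires_at is not None and now >= status.expires_at
    return expired or bool(status.open_alerts)


status = AuditStatus(issued_at=1_600_000_000, expires_at=1_700_000_000)
print(should_pause_integration(status, now=1_650_000_000))  # False
print(should_pause_integration(status, now=1_700_000_001))  # True

status.open_alerts.append("read-only reentrancy advisory")
print(should_pause_integration(status, now=1_650_000_000))  # True
```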
Contract vs. Protocol-level Audits: At the moment, EIP-7512 ties a single audit report to a single smart contract address. However, as anyone who has observed cross-contract reentrancy vulnerabilities knows, problematic behavior can arise from interactions between contracts within a protocol. Does this audit pertain to the individual contract? Or does it pertain to the safety of the protocol as a whole, of which this contract is a part?
1. Our thoughts: This could be something like a hash of all the addresses incorporated as part of this review process. Needs further discussion.
2. However, the team from Trustblock points out that it’s important to keep the ease of consuming data in mind. For a protocol seeking to integrate with this one, it’s easiest to retrieve the contracts included in this audit scope if they’re directly stored as addresses. This way, the protocol can easily check, for example, whether each of the contracts for a given token that seeks to integrate into a DEX is audited.
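
The trade-off between the two options can be sketched as follows. With a hash, a consumer can only confirm a scope it already knows in full; with stored addresses, it can check membership directly. The helper names are hypothetical, and SHA-256 is used only as a stand-in hash so the example runs with the Python standard library.

```python
import hashlib


def scope_commitment(addresses):
    """Order-independent hash over the normalized list of audited addresses."""
    normalized = sorted(a.lower() for a in addresses)
    return hashlib.sha256(",".join(normalized).encode()).hexdigest()


def all_in_scope(needed, audited_scope):
    """Direct membership check, possible only when addresses are stored as-is."""
    audited = {a.lower() for a in audited_scope}
    return all(a.lower() in audited for a in needed)


audited = ["0x01", "0x02", "0x03"]  # placeholder addresses for illustration

# Option 1: the hash matches only if the consumer already knows the complete list.
print(scope_commitment(["0x03", "0x01", "0x02"]) == scope_commitment(audited))  # True
print(scope_commitment(["0x01"]) == scope_commitment(audited))                  # False

# Option 2: with stored addresses, a partial query is enough.
print(all_in_scope(["0x01", "0x03"], audited))  # True
print(all_in_scope(["0x04"], audited))          # False
```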
Intra-Contract Scope: Audits are expensive, and auditors can quickly exhaust their allotted time due to the protocol’s budget constraints. Thus, within a contract, some functions, integrations, or functionality can be left out of scope. In the current proposal, the registry/end user would be responsible for chasing down this critical information. This could lead to a false sense of security.
1. Our thoughts: Either incorporate this into the above scope parameter or make this its own parameter. This could be something like a hash of the functions included in this audit, with the registry responsible for identifying which functions are therefore unaudited. Needs further discussion.
“Been audited” vs. “Passed an audit”: Security firms understand that auditors can only offer suggestions and remarks; they can't guarantee that the protocol creators will address those issues. While a protocol may have 'been audited,' it may not necessarily have 'passed an audit.'
1. Our thoughts: The current EIP contains a disclaimer at the end, stating that 'this ERC MUST NOT be considered an attestation of a contract’s security.' It frames on-chain audits as 'data relevant to a smart contract' that should still undergo manual review.
2. However, in an industry plagued by hacks, users crave security guarantees. We are concerned that the on-chain nature of these reports may lead some to view them as attestations of security, mainly because no better alternatives exist and 'on-chain' conveys a sense of trust. This false sense of security is further compounded when critical metadata isn't included, such as:
  1. Total scope (“did the auditors review all the contracts in this protocol?”)
  2. Per-contract scope (“did the auditors review each of the functions in this protocol, or did they run out of time/budget?”)
  3. Resolution status (“did the protocol implement all of the suggested fixes? If some findings were left as ‘acknowledged’, do the auditors still count this protocol as ‘passing’ their review?”)
3. We support leaving Findings standardization to a future implementation––but we believe including the critical metadata will vastly improve community members’ trust in onchain audit representations.
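
As one possible shape for that critical metadata, the sketch below separates “been audited” from “passed an audit”. The `AuditMetadata` fields and the pass rule in `summarize` are our own assumptions for illustration; the EIP defines neither.

```python
from dataclasses import dataclass
from enum import Enum


class Resolution(Enum):
    FIXED = "fixed"
    ACKNOWLEDGED = "acknowledged"
    OPEN = "open"


@dataclass
class AuditMetadata:
    contracts_in_protocol: int          # total scope: how many contracts exist
    contracts_reviewed: int             # total scope: how many were reviewed
    functions_out_of_scope: list        # per-contract scope gaps
    findings: list                      # resolution status, one Resolution per finding
    auditor_accepts_acknowledged: bool  # is the auditor comfortable with "acknowledged"?


def summarize(meta: AuditMetadata) -> dict:
    """Hypothetical rule: pass only with no open findings and no unaccepted acknowledgements."""
    full_scope = (meta.contracts_reviewed == meta.contracts_in_protocol
                  and not meta.functions_out_of_scope)
    has_open = any(f is Resolution.OPEN for f in meta.findings)
    has_ack = any(f is Resolution.ACKNOWLEDGED for f in meta.findings)
    passed = not has_open and (not has_ack or meta.auditor_accepts_acknowledged)
    return {
        "been_audited": meta.contracts_reviewed > 0,
        "full_scope": full_scope,
        "passed": passed,
    }


meta = AuditMetadata(5, 3, ["withdraw"],
                     [Resolution.FIXED, Resolution.ACKNOWLEDGED], False)
print(summarize(meta))
# {'been_audited': True, 'full_scope': False, 'passed': False}
```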

Analyzing the proposed Proxy addition

An in-progress follow-on to the existing EIP outlines the challenges of dealing with proxies.

Proxies are an extremely important and abundant part of the Ethereum ecosystem. To be relevant to the modern Ethereum ecosystem, we believe that any proposal involving on-chain audits must consider how to address proxies.

From a practical perspective, because proxies are so abundant, we believe that if a proposal does not consider how to handle them, the market may demand it anyway. If protocols find that on-chain audits increase their credibility, those that use proxies will be eager to participate. Consequently, auditors will be incentivized to manage on-chain audits for these proxies in some way.

If a standard isn't established, divergent implementations may occur. These could be detrimental to the ecosystem as a whole—compromising the composability that DeFi projects often rely on and potentially misrepresenting the security of a protocol for its users and other composed protocols.

Here, we'll focus primarily on the Diamond pattern, but future work should consider how a standard may apply to all proxy patterns.

What makes a standard for Proxy/Factory on-chain audit data difficult?

tl;dr: Each form of proxy may have a different method for loading the implementation contract address.

Because there are many forms of proxy contracts, and not all of them adhere to established standards, it's difficult to define rules that work for all.

However, ignoring all of them doesn't seem like a great solution either.
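
To illustrate how differently two common proxy forms expose their implementation address, here is a small sketch. An EIP-1167 minimal proxy embeds the address in its runtime bytecode, while an EIP-1967 proxy stores it in a fixed storage slot. The helper names are hypothetical, and storage is mocked as a plain dictionary; in practice the slot would be read from a node, for example via the `eth_getStorageAt` JSON-RPC method.

```python
# Fixed slot defined by EIP-1967: keccak256("eip1967.proxy.implementation") - 1
EIP1967_IMPL_SLOT = 0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc

# Runtime bytecode of an EIP-1167 minimal proxy: prefix + 20-byte address + suffix
EIP1167_PREFIX = bytes.fromhex("363d3d373d3d3d363d73")
EIP1167_SUFFIX = bytes.fromhex("5af43d82803e903d91602b57fd5bf3")


def implementation_from_eip1167(runtime_code: bytes):
    """Minimal proxies embed the implementation address in their bytecode."""
    if (len(runtime_code) == 45
            and runtime_code.startswith(EIP1167_PREFIX)
            and runtime_code.endswith(EIP1167_SUFFIX)):
        return "0x" + runtime_code[10:30].hex()
    return None


def implementation_from_eip1967(storage: dict):
    """EIP-1967 proxies keep the address in the low 20 bytes of a fixed slot."""
    word = storage.get(EIP1967_IMPL_SLOT, 0)
    if word == 0:
        return None
    address = word & ((1 << 160) - 1)
    return "0x" + address.to_bytes(20, "big").hex()


impl = bytes.fromhex("11" * 20)  # placeholder implementation address
minimal_proxy_code = EIP1167_PREFIX + impl + EIP1167_SUFFIX
print(implementation_from_eip1167(minimal_proxy_code))
print(implementation_from_eip1967({EIP1967_IMPL_SLOT: int("22" * 20, 16)}))
```

Neither helper applies to a Diamond (EIP-2535), where the implementation is resolved per function selector rather than once per proxy, which is part of why a single rule is hard to write.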

What do the authors currently propose in their follow-on?

The current follow-on edits pay considerable attention to Diamond pattern proxies. While the authors discuss adding more explicit data to the audit summary, such as whether the contract is a proxy and whether it’s upgradeable, they ultimately choose to leave these specifics up to the ecosystem to decide.

The authors devote a considerable amount of time to whether runtimeCode or deploymentCode is better for representing the underlying implementation contract to which a proxy points.

Ultimately, the choice between runtimeCode and deploymentCode is perhaps the biggest debate at the heart of this addition. It points to a key trade-off: how precisely can we match the implementation while still making it easy to retrieve and verify?

What’s the difference between `deploymentCode` and `runtimeCode`, and why does it matter?

If we peel back a layer, we recall that proxies aren't magic. At its core, a proxy is simply a standardized use of the delegatecall instruction, which takes a target address and executes the runtimeCode stored at that address against the calling contract's own storage.

However, while runtimeCode contains the code that gets executed on the EVM, it crucially doesn’t contain constructor logic or constructor args. This constructor logic can include critical security mechanisms like role assignments, which if altered can lead to vulnerabilities.

deploymentCode solves part of this problem: it also contains the constructor and any initialization logic, which runtimeCode lacks. However, it’s more difficult to retrieve. We can’t just call web3.eth.getCode, as we can for runtimeCode; we need to grab the bytecode from the transaction that created the contract. And it still doesn’t contain constructor args.
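
A small sketch of where the constructor args end up may help. It assumes a contract deployed directly by a transaction, with the encoded constructor args appended after the deployment code in the transaction's input data, as Solidity-compiled contracts do. The helper name and the byte values are placeholders for illustration, not real contract bytecode.

```python
def split_creation_input(tx_input: bytes, deployment_code: bytes):
    """Split creation-transaction input into (deployment code, trailing constructor args)."""
    if not tx_input.startswith(deployment_code):
        raise ValueError("input does not begin with the expected deployment code")
    return deployment_code, tx_input[len(deployment_code):]


# Placeholder bytes standing in for compiled deployment code.
deployment_code = bytes.fromhex("6080604052")
# One constructor argument, the integer 42, as a 32-byte big-endian word.
constructor_args = (42).to_bytes(32, "big")

tx_input = deployment_code + constructor_args
code, args = split_creation_input(tx_input, deployment_code)

print(code == deployment_code)      # True
print(int.from_bytes(args, "big"))  # 42
```

Two deployments with the same deployment code but different trailing args would be indistinguishable if only the code were recorded.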

How can a lack of constructor args get us into trouble?

Let's consider the example of Factory contracts. In the Ethereum ecosystem, a Factory can deploy multiple contracts that look exactly the same—with the same runtimeCode and deploymentCode—but with different constructor args. These differing constructor arguments can trigger completely different behaviors, introducing new attack vectors and opportunities for rug pulls.

Let’s consider a few examples:

Uninitialized State: If a contract expects some state to be initialized through the constructor but it isn’t, this could lead to vulnerable behavior. Consider an admin role that is supposed to be set in the constructor. If improperly initialized, it may happen that any address (or no address) is able to claim the admin role post-deployment.
Parameter Misconfig: Constructor arguments often include configuration parameters like fees and spend limits. These can also include thresholds for stopping mechanisms like circuit breakers. If these params are set incorrectly, they could either create an exploit vector or make the contract unusable.
Oracle Initialization: If a contract relies on an oracle and the address is set through the constructor, an incorrect address or a malicious oracle could feed wrong data.
Initial Minting: In token contracts, the constructor may define initial distributions or directly mint tokens. Incorrect arguments could lead to an imbalance, creating opportunities for a "rug pull" where the initial recipients could crash the market by selling off their disproportionately large holdings.
Initial Governance Parameters: In DAOs or other governance-based systems, constructor arguments could include initial voting powers or other governance params. An incorrect setup can lead to a governance attack.
Whitelists/Blacklists: In contracts with restricted access, the constructor often sets initial whitelists or blacklists. Incorrect settings can lead to unauthorized access for hackers or block legitimate users.
Time-Locked Functions: Some contracts have functions that are time-locked or can only be called after a certain block number. This block number may be set in the constructor. An incorrect setting could either lock functions indefinitely or make them unexpectedly available immediately.

Approaches to codifying constructor args

Because of the ambiguity that mechanisms like Factories can introduce, it's important for audits to point to a precise state of deployment. Constructor arguments play a critical role in this process. Below are our initial thoughts on two approaches to codifying this information:

Approach A: Pre-deployment Auditor Verification. The protocol being audited can provide the exact constructor args that will be used at deployment. The auditor can take all of the constructor args in sequence, concatenate them, and publish a hash of the result. Anyone looking to verify that this implementation matches not only the expected runtimeCode and deploymentCode but also the expected args can grab the arguments from the input data field of the initial transaction, concatenate them, hash the result, and compare it against the auditor-provided hash.
- This concatenation method could be standardized, providing an interface for raw arguments (pre-deployment) and an interface for interpreting input data scraped from an initialization transaction (post-deployment).
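
A minimal sketch of Approach A, under stated assumptions: the args are encoded as fixed 32-byte words (a toy encoding), the contract is deployed directly by a transaction whose input is the deployment code followed by the args, and SHA-256 stands in for the hash function. On-chain tooling would more likely use keccak256, which is not in the Python standard library (`hashlib.sha3_256` is the NIST variant, not Ethereum's Keccak-256). All function names are hypothetical.

```python
import hashlib


def encode_uint_args(*values: int) -> bytes:
    """Toy encoding: each unsigned integer as a 32-byte big-endian word, in order."""
    return b"".join(v.to_bytes(32, "big") for v in values)


def constructor_args_hash(encoded_args: bytes) -> str:
    """Pre-deployment: the auditor hashes the concatenated args (SHA-256 stand-in)."""
    return hashlib.sha256(encoded_args).hexdigest()


def verify_deployment(tx_input: bytes, deployment_code: bytes, auditor_hash: str) -> bool:
    """Post-deployment: recover the args from the creation input and compare hashes."""
    if not tx_input.startswith(deployment_code):
        return False
    args = tx_input[len(deployment_code):]
    return constructor_args_hash(args) == auditor_hash


deployment_code = bytes.fromhex("6080604052")   # placeholder bytes
audited_args = encode_uint_args(100, 7)          # e.g. a fee and a limit
auditor_hash = constructor_args_hash(audited_args)

honest_input = deployment_code + audited_args
altered_input = deployment_code + encode_uint_args(100, 9999)

print(verify_deployment(honest_input, deployment_code, auditor_hash))   # True
print(verify_deployment(altered_input, deployment_code, auditor_hash))  # False
```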
Approach B: Post-deployment Auditor Verification. The protocol being audited could also opt to do such verification post-deployment. The central idea here is a bit like Etherscan: even though the input args on the initialization transaction might not be immediately interpretable, they could be provided to a registry in a human-readable form and verified against the initialization transaction. To make this even better, an auditor could attest to the link between certain constructor args provided by the protocol and the values stored

Looking ahead

We made it! We are enormously grateful for your attention and hope you walked away from this article with new insights/questions and feel more thoroughly prepared to make an active contribution to this EIP and its goals.

EIP-7512 is an incredibly exciting step towards deeper security throughout the ecosystem. This initial proposal is a small step––one that we believe would be bolstered by the addition of key audit metadata including individual auditor identification, dynamic updates, scope parameters, and auditor confidence.

It’s incredibly exciting to be here in this critical moment for blockchain security. If you want to join us in making a follow-up to this research, please reach out.

Safety is a community effort. Anyone is welcome to share their reflections either in the comments or directly––our DMs are open.

Again… the goal isn't to compete with other groups (whom we deeply respect), but rather to help them build a technology standard that increases the probability of reliable mass adoption. We hope our materials fill you with optimistic energy! Peace and evolution, anon!
