Perennial is a derivatives primitive that allows for the creation of two-sided markets that trade exposure to an underlying price feed. In other words, each epoch, Chainlink price feeds are used to determine the winning side of the market, and transfer assets from one side to the other.
I saw they’d recently launched a bounty on Immunefi, and since there was a lull in Sherlock contests and I had a couple free days, I decided to take a look.
The protocol runs in epochs called “versions”. Each version occurs when Chainlink releases new data on the price feed, and the protocol updates and adjusts balances accordingly.
Without getting too much into the details:
The settle() function is called by all important protocol level actions (or can be called by any external actor) and kicks off the global settlement flywheel.
The Oracle grabs the latest data from Chainlink and converts it into an OracleVersion struct.
If we're in a new round (with special logic for if multiple rounds have passed), we settle up positions, fees, and collateral balances at the "product" level.
Whenever important protocol actions are taken at the account level, they call _settle() to ensure the global flywheel is up to date, and then settle up the individual account to ensure it's in line with the product level.
I quickly turned my attention to the Chainlink Oracle.
The first thing that jumped out was a strange source of complexity. Perennial uses “versions” to represent the chunks of new data, but Chainlink organizes their rounds slightly differently, so there was some translation that needed to happen.
What exactly does Round ID mean on Chainlink? When you’re querying an aggregator directly, Round ID is a simple ever-increasing value.
However, when you query a proxy, there needs to be a way to differentiate which aggregator it’s from.
The way Chainlink accomplishes this is that they use the least significant 64 bits for the round ID, and use the most significant 16 bits to store which “phase” the price feed is in (in other words, which aggregator to use).
This means you can take roundId >> 64 and it will return the phase.
Similarly, we can take roundId & type(uint64).max and it will return the round ID within the given phase.
The roundId is a uint80 (10 bytes: 2 for the phase and 8 for the round), so you can picture it, in abbreviated form, like 0xPPRRRRRRRR, where P is phase and R is round.
This makes sense, but the result is that the numerical values of these round IDs can vary wildly. Setting the 65th least significant bit adds 2^64 to the number, so these values aren't very easy to work with.
(source: https://docs.chain.link/data-feeds/price-feeds/historical-data)
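The encoding can be sketched in a few lines of Python. This is only an illustration of the bit layout described above (16 bits of phase on top of 64 bits of round); the function names are made up for the example and are not part of Chainlink or Perennial.

```python
from typing import Tuple

PHASE_OFFSET = 64                     # the low 64 bits hold the round within the phase
ROUND_MASK = (1 << PHASE_OFFSET) - 1  # same value as Solidity's type(uint64).max


def pack_round_id(phase_id: int, aggregator_round: int) -> int:
    """Combine a phase and a round within that phase into a proxy round ID."""
    return (phase_id << PHASE_OFFSET) | aggregator_round


def split_round_id(round_id: int) -> Tuple[int, int]:
    """Return (phase_id, aggregator_round) for a proxy round ID."""
    return round_id >> PHASE_OFFSET, round_id & ROUND_MASK


last_of_phase_1 = pack_round_id(1, 20)
first_of_phase_2 = pack_round_id(2, 1)
print(split_round_id(first_of_phase_2))               # (2, 1)
print(first_of_phase_2 - last_of_phase_1 == 2**64 - 19)  # True
```

Two consecutive rounds on either side of a phase change end up almost 2^64 apart, which is why the raw round IDs are awkward to use as a counter.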
What Perennial wanted to accomplish was to have their “version” number increment by 1 each round, regardless of whether there was a new Chainlink phase.
They accomplished this as follows:
They store an array called _startingVersionForPhaseId, which begins with a value of [0, 0].
Each new Chainlink phase, they store the version that corresponds to the first round of that phase. For example, if there is a new phase that starts in the 20th round, the array becomes _startingVersionForPhaseId = [0, 0, 20].
For any future round ID returned by Chainlink, they can calculate the phase (using roundId >> 64), find the starting version for that phase (using the position in the array), determine which round we are on within that phase (using roundId & type(uint64).max), and add the starting version for the phase to the round within the phase to get the current version.
The result is that we can easily calculate version from round ID.
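Here is a rough Python model of that intended mapping. It is a sketch under two assumptions that are mine, not Perennial's: each phase's first round is round 1, and the version offset is counted from that first round. The function name is hypothetical.

```python
from typing import List

PHASE_OFFSET = 64
ROUND_MASK = (1 << PHASE_OFFSET) - 1
FIRST_ROUND_IN_PHASE = 1  # assumption: every phase starts at round 1


def round_id_to_version(round_id: int, starting_versions: List[int]) -> int:
    """Translate a proxy round ID into a version that grows by 1 per round."""
    phase_id = round_id >> PHASE_OFFSET
    round_in_phase = round_id & ROUND_MASK
    return starting_versions[phase_id] + (round_in_phase - FIRST_ROUND_IN_PHASE)


starting_versions = [0, 0, 20]  # the new phase starts at version 20
print(round_id_to_version((1 << PHASE_OFFSET) | 20, starting_versions))  # 19
print(round_id_to_version((2 << PHASE_OFFSET) | 1, starting_versions))   # 20
```

The last round of phase 1 and the first round of phase 2 map to consecutive versions, even though their raw round IDs are far apart.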
So, whenever the phase ID has increased, we need to add a new value to _startingVersionForPhaseId. Here's how this is implemented in Perennial's code:
while (round.phaseId() > _latestPhaseId()) {
    uint256 roundCount = registry.getRoundCount(base, quote, _latestPhaseId());
    _startingVersionForPhaseId.push(roundCount);
}
Do you see the problem?
Here's a hint: the roundCount variable holds the total number of rounds within the given phase.
In other words, this code takes the total rounds within the given phase and pushes it onto the array. The first time this happens, it'll work fine, because the number of rounds in the phase is equal to the total number of rounds in the history of the protocol. In the example above, this would successfully push 20 onto the array, giving us an array of _startingVersionForPhaseId = [0, 0, 20].
But what about the next phase shift? Let's say Chainlink shifts to a new phase again 20 rounds later. Ideally, we should now be pushing 40 onto the array, keeping our calculations correct. But if we look at the above code, the roundCount after 20 more rounds will only be 20.
This is an issue. We shouldn’t be adding the number of rounds in the last phase to the array. We should be adding the total number of rounds that have ever occurred up to this point.
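To make the difference concrete, here is a small Python replay of two phase changes. The "cumulative" branch is just one way to express the intended behavior for illustration; it is not Perennial's actual patch, and the function name is invented for this sketch.

```python
from typing import List


def build_starting_versions(rounds_per_phase: List[int], cumulative: bool) -> List[int]:
    """Replay phase changes and return the resulting starting-version array.

    rounds_per_phase lists the round count of each phase that has ended, in order.
    """
    starting_versions = [0, 0]  # initial value of the array
    for round_count in rounds_per_phase:
        if cumulative:
            # Illustrative fix: the new phase starts where the previous phase
            # started, plus the number of rounds the previous phase contained.
            starting_versions.append(starting_versions[-1] + round_count)
        else:
            # Behavior of the loop shown above: push the per-phase count as-is.
            starting_versions.append(round_count)
    return starting_versions


print(build_starting_versions([20, 20], cumulative=False))  # [0, 0, 20, 20]
print(build_starting_versions([20, 20], cumulative=True))   # [0, 0, 20, 40]
```

After a single phase change the two arrays are identical, which is why the bug stays hidden until the second one.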
As soon as I saw this error, I knew we had a bug. The remaining question was whether it could be exploited, and what damage it could cause.
It was obvious that, if this value was off, the translations between round ID and version would be off. So my first thought was to find all the places in the code that used this translation, and see where it could become a problem.
Before that, I needed to understand how exactly it would misbehave.
The core function that used this information was _versionToRoundId(), which takes the following form:
function _versionToRoundId(uint256 version) private view returns (uint80) {
    uint16 phaseId = _versionToPhaseId(version);
    return registry.getStartingRoundId(base, quote, phaseId) +
        uint80(version - _startingVersionForPhaseId[phaseId]);
}
How would this function act in the event of an incorrect _startingVersionForPhaseId array?
Here's what would happen:
The final value in the array would be lower than it should be
phaseId would return the position of that final value (len(array) - 1)
registry.getStartingRoundId would return the correct starting round (from Chainlink) for the most recent phase
version - _startingVersionForPhaseId[phaseId] would return a number of versions much higher than it should, since it's subtracting the smaller final array value from the correct current version
Let's look at the above example. We've changed phases at round 20 and round 40, and the array is now _startingVersionForPhaseId = [0, 0, 20, 20]. We call _versionToRoundId() for version 41:
phaseId = 3
registry.getStartingRoundId = 0x0300000001 (phase 3, round 1 within the phase)
return value (converting to hex for consistency): 0x0300000001 + 0x29 - 0x14 = 0x0300000016
But this is actually 20 rounds in the future. Looking up that round will revert, and will keep reverting for 20 rounds until Chainlink catches up.
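The same walk-through can be checked with a short Python model. Two parts are assumptions on my side: the phase lookup (the body of _versionToPhaseId() isn't shown here, so I use "latest phase that starts at or before the version") and the idea that every phase begins at round 1. The Python names are hypothetical stand-ins for the Solidity functions.

```python
from typing import List

PHASE_OFFSET = 64  # the low 64 bits of a proxy round ID hold the round within the phase


def version_to_phase_id(version: int, starting_versions: List[int]) -> int:
    """Assumed stand-in for _versionToPhaseId: latest phase starting at or before version."""
    for phase_id in range(len(starting_versions) - 1, 0, -1):
        if version >= starting_versions[phase_id]:
            return phase_id
    raise ValueError("version precedes every known phase")


def version_to_round_id(version: int, starting_versions: List[int]) -> int:
    """Mirror of the _versionToRoundId arithmetic shown above."""
    phase_id = version_to_phase_id(version, starting_versions)
    # Assumption: the first round of every phase is round 1.
    starting_round_id = (phase_id << PHASE_OFFSET) | 1
    return starting_round_id + (version - starting_versions[phase_id])


buggy = version_to_round_id(41, [0, 0, 20, 20])     # array produced by the bug
intended = version_to_round_id(41, [0, 0, 20, 40])  # array with cumulative values
print(buggy >> PHASE_OFFSET, intended >> PHASE_OFFSET)  # 3 3
print(buggy - intended)                                 # 20
```

Both results land in phase 3, but the buggy array points 20 rounds past the round that actually exists.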
This is actually somewhat fortunate. Incorrect oracles can cause nasty problems if they return the wrong values, but the only issue caused here would be a freezing of the protocol, so the downstream uses of this function would not be exploitable. It will simply cause all syncing to revert until Chainlink catches up.
However, this syncing issue can go on for quite a long time. The total locked time caused by Chainlink upgrading to phase X can be calculated as:
lockedTime = (rounds in the history of the protocol before phase (X - 1) began) * length of a Chainlink round
This seemed like a substantial enough problem to be worth submitting, so I wrote up a report and submitted it to the Perennial team.
I submitted the following simple POC along with my submission.
Chainlink updates the phase ID of a feed. This adds roundCount to _startingVersionForPhaseId. Let's imagine it's been 2 years and the oracle updates hourly, so roundCount would equal roughly 24 * 365 * 2 = 17520. The new array is [0, 0, 17520].
Six months later, Chainlink updates the phase ID again. This adds roundCount to _startingVersionForPhaseId. In six months, roundCount would equal roughly 24 * 180 = 4320. The new array is [0, 0, 17520, 4320].
At the next settlement, Product.sol#atVersion() is called with the latest version as a parameter. In this case, latestVersion will be the version right before the phase update, which will equal 17520 + 4320 - 1 = 21839. This calls oracle().atVersion(21839).
This function begins by calling _versionToRoundId() with the version passed. This calls _versionToPhaseId(), which will return the current phase because version > _startingVersionForPhaseId[latestPhaseId]. This calculates the starting round id for the phase (which will equal the current round id, because it's looking at the new phase), the version (which will be 21839) and _startingVersionForPhaseId[phaseId] (which will be 4320). It will return currentRound + 21839 - 4320.
Because this round is 21839 - 4320 = 17519 rounds in the future, the call to registry.getRound() (which calls Chainlink's Feed Registry with getRoundData()) will revert, with an invalid round error.
This error will continue until the round is realized, which will be in approximately 17519 / 24 = 730 days = 2 years.
Because we will be unable to call settle() and move between versions, the Products will all become locked until this error resolves.
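For anyone who wants to double-check the numbers in the POC, here is the arithmetic as a few lines of Python. The hourly update rate and the two phase lengths are the POC's own illustrative assumptions, not measured values.

```python
ROUNDS_PER_DAY = 24  # assumption from the POC: the oracle updates hourly

rounds_in_first_phase = ROUNDS_PER_DAY * 365 * 2  # two years of rounds
rounds_in_second_phase = ROUNDS_PER_DAY * 180     # six months of rounds

# Array after both phase changes, as produced by the buggy loop.
starting_versions = [0, 0, rounds_in_first_phase, rounds_in_second_phase]

# Version right before the second phase update.
latest_version = rounds_in_first_phase + rounds_in_second_phase - 1

# How far past the current round the computed round ID lands.
rounds_in_future = latest_version - starting_versions[-1]
days_locked = rounds_in_future / ROUNDS_PER_DAY

print(starting_versions)   # [0, 0, 17520, 4320]
print(latest_version)      # 21839
print(rounds_in_future)    # 17519
print(round(days_locked))  # 730
```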
This was my first day trying out Immunefi, so I wasn't sure what to expect. I'd heard rumors that protocols might be selfish, difficult to deal with, slow, etc.
I had the exact opposite experience:
The team got back to me within 30 minutes to say they were exploring the issue.
They confirmed it 19 hours later and laid out a fair bounty offer.
Payment was sent by the end of the day.
I was extremely impressed with the Perennial team and plan to take a closer look at their contracts when I have more time, as I know they treat white hats fairly and respectfully.
This was a good first day, but if my goal was to reach Top 10 on the Immunefi Leaderboard by the end of 2023, I needed to find something a lot more serious.
Luckily, I received a mysterious message the next day that would lead me down a rabbit hole towards my first major find. Look out for the write-up soon :)