Navigating the DeFi Pricing Challenge with Mobula’s Octopus Engine

In the realm of DeFi, asset pricing remains a complex challenge. Existing platforms like CoinMarketCap offer incomplete coverage because assets have to be added manually, and they handle low-volume assets poorly, while DEX explorers such as DexScreener can only show individual trading pairs, not the market-wide price of an asset. At Mobula, we’re bridging this gap with our market data engine, Octopus.

Introduction

Octopus harnesses real-time multi-chain DeFi event streaming from 30+ blockchains to maintain the current state of liquidity pairs and compute prices for millions of DeFi assets. This removes the need for external updates and manual inputs, making coverage both complete and instant.

Octopus computes aggregated prices based on volume, with on-chain liquidity as a fallback, and updates prices by recomputing the LP ratios of all assets every second. Our volume system is engineered to reflect true market activity: it discounts volume spikes that could skew asset valuation and filters out price manipulation by MEV bots or flash loans.

Octopus solves pain points of current solutions:

  1. Reliance on manual work for asset tracking (the CoinMarketCap team has to manually list an asset for it to be tracked)

  2. Lack of coverage of new DeFi-bootstrapped tokens during their first hours or days of trading (for example, Pepe was added to CoinMarketCap more than 7 days after its launch)

  3. Lack of on-chain aggregated coverage for tokens & assets (CoinMarketCap tracks assets both on & off-chain in a fully aggregated way, which means that during depeg events or on-chain exploits, there is no way to track the aggregated on-chain price of the underlying tokens)

OK, but why can’t we use DexScreener instead? Pair-based data providers are great for some use cases, but because they are pair-based, they don’t cover the full picture and aren’t resilient to exchange migrations, protocol migrations, or stablecoin depegs. That makes aggregated prices much more suitable for lending protocols, portfolio pricing, checkouts, and virtually every pricing use case other than trading on the given pair. Let’s dig into how Octopus works.

Pricing computation

One of the issues with AMM pricing is that everything is based on ratios - tokenA is worth Y tokenB, not X USD. Thus, you can only compute tokenA in $ if you know tokenB in $. But, in a closed system, you’d also need to compute tokenB’s price, based on… tokenA, or maybe tokenC if tokenB has two liquidity pools.

To solve this problem, there are two approaches:
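
To make the dependency concrete, here is a minimal sketch of ratio-based pricing. It assumes a simple constant-product style pool where the spot ratio is just the ratio of reserves, and it ignores fees and slippage; the `Pool` shape and function names are illustrative, not Octopus internals.

```typescript
// Hypothetical pool shape; field names are illustrative, not Octopus types.
interface Pool {
  reserveA: number; // amount of tokenA held by the pool
  reserveB: number; // amount of tokenB held by the pool
}

// Spot ratio: how many tokenB one tokenA is worth in this pool.
function ratioAToB(pool: Pool): number {
  return pool.reserveB / pool.reserveA;
}

// USD price of tokenA. Note the dependency: priceBUsd must already be known.
function priceAInUsd(pool: Pool, priceBUsd: number): number {
  return ratioAToB(pool) * priceBUsd;
}

const examplePool: Pool = { reserveA: 1000, reserveB: 2500 };
const priceA = priceAInUsd(examplePool, 2); // assumes tokenB is already priced at $2
```

The pool alone only yields the ratio (2.5 tokenB per tokenA here); the USD figure exists only because a price for tokenB was supplied from somewhere else.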

External pricing from cherry-picked assets

We can imagine cherry-picking some assets (say ETH, BNB, UNI) & tracking them off-chain, e.g. on Binance, or through CoinGecko. This would solve the chicken-and-egg problem by giving a first price to tokenA, allowing us to compute tokenB, and so on.

However, this exposes us to 2 pain points: depegs and scale changes. One of the cherry-picked assets could depeg (1) because the CEX we track goes bankrupt, or (2) because the on-chain contract is exploited, e.g. Gala with its wrapped BNB Chain version. A single asset blowing up can have devastating consequences on the prices of the whole engine, as… everything is ratios. If Gala is worth x1000 less than it should be on a CEX, tokenB will be computed as worth x500 more than it is - tokenC (if paired with tokenB) will thus be computed as worth x250 more, and so on…

A similar thing can happen with scale changes: the project deploys a new contract with a different number of decimals or a different supply, and the CEX price jumps x100000. If Octopus hasn’t caught up (because it requires a manual intervention), we still believe the old contract is worth x100000, leading to the same propagation of incorrect prices. Let’s take an example:

Now, imagine that C is injected as worth $1000 - we’d recompute B, then A, the following way (P is a USD price, R a pool ratio, and L a pool’s liquidity):

$$P_b = \frac{R_{AB} \times P_a \times L_a + R_{CB} \times P_c \times L_c}{L_a + L_c} = \frac{\frac{5}{2} \times 2 \times 5000 + \frac{1}{3} \times 1000 \times 4500}{5000 + 4500} \approx 160.53$$

Then, we recompute the USD price of token A.

$$P_a = \frac{R_{BA} \times P_b \times L_b}{L_b} = \frac{\frac{2}{5} \times 160.53 \times 2000}{2000} \approx 64.21$$

In short, the impact of token C’s price propagates to all the other prices…

The main pain point of this solution is the need to manually maintain the cherry-picked assets, and the price discrepancies that can result.
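
The propagation can be replayed in a few lines. This is a sketch that plugs the example’s inputs into the same liquidity-weighted formula; the `Leg` type and `weightedPrice` helper are names made up for this illustration.

```typescript
// One pool's contribution to a token's USD price (illustrative type).
interface Leg {
  ratio: number;          // R: units of the counterpart token per unit of the priced token
  counterpartUsd: number; // P: USD price of the counterpart token
  liquidity: number;      // L: weight given to this pool
}

// Liquidity-weighted average of the prices implied by each pool.
function weightedPrice(legs: Leg[]): number {
  let numerator = 0;
  let denominator = 0;
  for (const leg of legs) {
    numerator += leg.ratio * leg.counterpartUsd * leg.liquidity;
    denominator += leg.liquidity;
  }
  return numerator / denominator;
}

// Token B is paired with A (priced at $2) and with C (injected at $1000).
const pB = weightedPrice([
  { ratio: 5 / 2, counterpartUsd: 2, liquidity: 5000 },
  { ratio: 1 / 3, counterpartUsd: 1000, liquidity: 4500 },
]);

// Token A is then repriced from its pool with B, inheriting the error.
const pA = weightedPrice([{ ratio: 2 / 5, counterpartUsd: pB, liquidity: 2000 }]);
```

Token A started at $2 and ends up repriced more than thirty times higher, purely because of the value injected for C.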

Stablecoin pricing

The other solution to the USD pricing problem is to maintain a whitelist of stablecoins and consider them pegged to the dollar, no matter what. This also solves the chicken-and-egg problem: if tokenA is a whitelisted stablecoin, we can price it at $1, so tokenB, tokenC, etc. can be computed.

This is also much more resilient than the previous system as it doesn’t rely on external providers like Binance or CoinGecko, and is totally insensitive to scale changes.

The only remaining issue is the depeg one - if a stablecoin we consider worth $1 isn’t worth $1 anymore… all our prices will be off, depending on how much it depegged and how liquid it was. This requires building a monitoring solution for stablecoin depegs (which we’re going to open to the public soon; fill in this form if you are interested in getting early access) that removes a stablecoin from the whitelist when it trades more than 0.1% away from the dollar. This reduces the existing risks by a huge factor and is a near-perfect solution.
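
The whitelist rule itself is easy to sketch. The snippet below only shows the 0.1% filter; how the observed price of each stablecoin is measured is not described in this article, so `StablePrice` and its values are assumptions for illustration.

```typescript
// 0.1% tolerance around $1, as quoted above.
const DEPEG_THRESHOLD = 0.001;

// Hypothetical input: an observed USD price for each whitelisted stablecoin.
interface StablePrice {
  symbol: string;
  usd: number;
}

// Returns the symbols that stay on the whitelist for the next pricing round.
function stillPegged(prices: StablePrice[], threshold: number = DEPEG_THRESHOLD): string[] {
  return prices
    .filter((p) => Math.abs(p.usd - 1) <= threshold)
    .map((p) => p.symbol);
}

const whitelist = stillPegged([
  { symbol: "USDC", usd: 1.0004 },
  { symbol: "USDT", usd: 0.9995 },
  { symbol: "MIM", usd: 0.97 }, // made-up depegged value
]);
```

With these made-up inputs, only the first two symbols remain on the whitelist, so the third can no longer anchor any USD price.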

Price weighting

Not all trading pairs are created equal, and not all of them are equally meaningful when it comes to an asset’s price. The main complexity of aggregated price computation in DeFi is the insane number of broken pairs, where Bitcoin is trading at $100,000,000 or $0.00001, usually with a tiny amount of liquidity (or millions of dollars, as we’ll see later on).

Simple rules like the volume-weighted average don’t work well, for 2 reasons:

  • Most assets don’t even have 24h volume as they’re near-death coins - without volume, no volume-weighted average

  • Even assets with volume often have totally broken pairs, and if one of these pairs is activated by a trade, your weighted average won’t last long - imagine a $10 trade on a pool pricing Bitcoin at $200,000,000,000,000, and $100,000,000 traded on pools with Bitcoin at its normal price ($37,000 as of this article) - you’d get:

    $$\frac{100000000 \times 37000 + 200000000000000 \times 10}{100000000 + 10} \approx 20\,036\,997$$

    The main pain point is that there is no limit to how broken a pair can be - it can reach scales where weighting by volume isn’t enough for hundreds of millions traded daily to outweigh a ridiculous $10 trade.
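
The Bitcoin example above can be reproduced with a naive volume-weighted average. This is a sketch with illustrative names (`PairQuote`, `naiveVwap`), shown only to make the failure mode tangible.

```typescript
// Illustrative per-pair snapshot: the price the pair implies and its traded volume.
interface PairQuote {
  priceUsd: number;
  volumeUsd: number;
}

// Plain volume-weighted average with no filtering at all.
function naiveVwap(pairs: PairQuote[]): number {
  const totalVolume = pairs.reduce((sum, p) => sum + p.volumeUsd, 0);
  const weighted = pairs.reduce((sum, p) => sum + p.priceUsd * p.volumeUsd, 0);
  return weighted / totalVolume;
}

const skewed = naiveVwap([
  { priceUsd: 37000, volumeUsd: 100000000 },    // healthy pools
  { priceUsd: 200000000000000, volumeUsd: 10 }, // broken pool hit by a $10 trade
]);
```

The broken pair carries a tiny fraction of the volume, yet it drags the average from $37,000 to roughly $20 million.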

The weighting mechanism we apply is the following:

  • We first try to find volume on the given asset - if it has some, we apply a weighted average over the pairs holding 5% or more of the asset’s total volume, thus excluding the broken pairs.

  • If the asset doesn’t have any volume, we fall back to a liquidity-weighted average, pricing the liquidity pools of the trading pairs in USD to give a weight to each of them.
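
The two rules above can be sketched as a single function. The `PairState` shape is hypothetical, and the case where no pair reaches the 5% share is not specified in this article, so this sketch simply returns `null` there.

```typescript
// Illustrative per-pair snapshot.
interface PairState {
  priceUsd: number;
  volumeUsd: number;
  liquidityUsd: number;
}

const MIN_VOLUME_SHARE = 0.05; // pairs need 5% or more of the asset's total volume

// Generic weighted average; returns null when nothing carries weight.
function weightedAverage(pairs: PairState[], weight: (p: PairState) => number): number | null {
  const total = pairs.reduce((sum, p) => sum + weight(p), 0);
  if (total === 0) {
    return null;
  }
  return pairs.reduce((sum, p) => sum + p.priceUsd * weight(p), 0) / total;
}

function aggregatePrice(pairs: PairState[]): number | null {
  const totalVolume = pairs.reduce((sum, p) => sum + p.volumeUsd, 0);
  if (totalVolume > 0) {
    // Rule 1: volume-weighted average over significant pairs only.
    const significant = pairs.filter((p) => p.volumeUsd / totalVolume >= MIN_VOLUME_SHARE);
    return weightedAverage(significant, (p) => p.volumeUsd);
  }
  // Rule 2: no volume at all, so fall back to liquidity weights.
  return weightedAverage(pairs, (p) => p.liquidityUsd);
}

const btc = aggregatePrice([
  { priceUsd: 37000, volumeUsd: 100000000, liquidityUsd: 0 },
  { priceUsd: 200000000000000, volumeUsd: 10, liquidityUsd: 0 },
]);
```

On the Bitcoin example, the broken pair holds far less than 5% of the volume, so it is dropped and the aggregate stays at the healthy pools’ price.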

When weighting liquidity pools, we found out that some LPs have millions of dollars on ONE side, with almost nothing on the other side (on concentrated AMMs like Uniswap V3). The pair 0x27807dd7adf218e1f4d885d54ed51c70efb9de50 on Arbitrum is holding $20,000,000 worth of MIM, and $2,500 worth of USDT! This leads, again, to totally broken prices for MIM.

We thus had to exclude the “token-side” liquidity and focus on the “other-token” side in weighted averages. Otherwise, anyone could create a random token, add 1B of it alongside 1 USDT, configure the concentrated AMM pool so that 1 token = 1 USDT, then create another pool with, say, USDC and a different setting - say 1 token = 0.5 USDC, again with 1B of “token”. This would make Octopus think that those LPs are super important - billions of dollars locked - and that USDT and USDC are worth $0.75 (by calculating the token price first). Weighting by the “other-token” side excludes this kind of cheat: you can print your new token, but you can’t print 1B USDC just for the sake of attacking Octopus.

Also, as with the broken pairs in the volume-weighted average, we had to set a threshold ($1,000) below which prices from a given LP are not taken into consideration. The issue is that below this amount, arbitrage incentives might not be high enough relative to gas fees to ensure that the pair stays in sync with the market price.
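
A small numeric sketch shows why the choice of weight matters. The two pools below are made-up numbers in the spirit of the attack described above (one honest pool, one pool stuffed with freshly minted tokens); `PoolView` and `weighted` are illustrative names.

```typescript
// Illustrative view of a pool from the priced token's perspective.
interface PoolView {
  impliedPriceUsd: number; // token price implied by this pool
  tokenSideUsd: number;    // priced-token reserves, valued at the implied price
  otherSideUsd: number;    // counterpart reserves (e.g. USDC, USDT), in USD
}

function weighted(pools: PoolView[], weight: (p: PoolView) => number): number {
  const total = pools.reduce((sum, p) => sum + weight(p), 0);
  return pools.reduce((sum, p) => sum + p.impliedPriceUsd * weight(p), 0) / total;
}

const pools: PoolView[] = [
  // Honest pool: balanced reserves, token at $2.
  { impliedPriceUsd: 2, tokenSideUsd: 50000, otherSideUsd: 50000 },
  // Attacker pool: 1B minted tokens "priced" at $0.5 against a single USDT.
  { impliedPriceUsd: 0.5, tokenSideUsd: 500000000, otherSideUsd: 1 },
];

// Weighting by total liquidity lets the minted tokens dominate the average.
const byTotalLiquidity = weighted(pools, (p) => p.tokenSideUsd + p.otherSideUsd);

// Weighting by the other-token side makes the attacker pool nearly irrelevant.
const byOtherSide = weighted(pools, (p) => p.otherSideUsd);
```

With total liquidity as the weight, the result collapses to about $0.50; with the other-token side as the weight, it stays just under $2, because the attacker only brought 1 USDT of real value.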

The final formula is then (basic weighted average, excluding < $1000 pairs):

$$P_{agg} = \frac{P_0 \times L_0 + P_1 \times L_1 + ... + P_n \times L_n}{L_0 + L_1 + ... + L_n}$$
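
As a sketch, the formula with its $1,000 cutoff could look like this, where L is taken to be the other-token side of each pool as discussed above. `LiquidityQuote` and `aggregateByLiquidity` are names chosen for the example.

```typescript
// Illustrative input: price implied by a pool and its other-token-side liquidity in USD.
interface LiquidityQuote {
  priceUsd: number;
  otherSideUsd: number;
}

const MIN_LIQUIDITY_USD = 1000; // pools below this are ignored

// Returns P_agg, or null when no pool passes the threshold.
function aggregateByLiquidity(quotes: LiquidityQuote[]): number | null {
  const eligible = quotes.filter((q) => q.otherSideUsd >= MIN_LIQUIDITY_USD);
  const totalLiquidity = eligible.reduce((sum, q) => sum + q.otherSideUsd, 0);
  if (totalLiquidity === 0) {
    return null;
  }
  const weightedSum = eligible.reduce((sum, q) => sum + q.priceUsd * q.otherSideUsd, 0);
  return weightedSum / totalLiquidity;
}

const pAgg = aggregateByLiquidity([
  { priceUsd: 1, otherSideUsd: 9000 },
  { priceUsd: 2, otherSideUsd: 1000 },
  { priceUsd: 50, otherSideUsd: 200 }, // below the threshold, ignored
]);
```

Returning `null` rather than a guess when every pool is below the threshold is a design choice of this sketch; the article does not say how Octopus handles that case.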

Done with price computation! Let’s now dig a little bit into how we’re getting the data.

Data feeds

Octoflow: Pulling events vs Streaming events

While building Octopus, we realized that, to our knowledge, there was no lightweight, production-grade framework simplifying data streaming & back-filling. So we decided to build one - Octoflow fits in < 500 lines of code, on top of Web3JS, and comes with:

  • Automatic backfilling

  • WSS retries & reconnect when down

  • Support for streams even on non-WSS nodes (which means streaming from public nodes of any chain, basically)

  • Dead-simple TypeScript handlers

Using Octoflow, we built our indexers for pair creation & swaps for 6 AMM protocols in 1,800 lines of plain TypeScript, directly integrated with our Redis, with no additional CLI, package, language, wizard tricks or external hosting required…

NB: Octoflow definitely doesn’t replace complex systems like The Graph - it’s just much easier & more flexible for simple indexing needs. We’re still looking to open source it in the near future.
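
Octoflow’s API isn’t shown in this article, so the snippet below is not its interface. It is only a generic sketch of one piece of back-filling: splitting the blocks missed since the last indexed block into bounded ranges that can be fetched one request at a time before live streaming resumes. The function name and parameters are assumptions.

```typescript
// Inclusive [fromBlock, toBlock] range to fetch in a single request.
type BlockRange = [number, number];

// Splits the gap between the last indexed block and the chain head into chunks.
function backfillRanges(lastIndexed: number, head: number, chunkSize: number): BlockRange[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const ranges: BlockRange[] = [];
  for (let from = lastIndexed + 1; from <= head; from += chunkSize) {
    const to = Math.min(from + chunkSize - 1, head);
    ranges.push([from, to]);
  }
  return ranges;
}

// Indexer stopped at block 100, chain head is 350, nodes accept 100 blocks per query.
const missed = backfillRanges(100, 350, 100);
```

Here `missed` covers blocks 101 to 350 in three ranges, and an indexer that is already at the head gets an empty list back.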

Coverage

Currently our data feeds cover 30 blockchains, and we’re working on a solution to open Octopus up to externally built support for new blockchains.

Blockchain foundations could then build the mappers & data streams themselves and integrate them into Octopus directly. This is a few months down the line, but will likely help solve the coverage pain point that the industry has with data APIs!

The same logic applies to the AMM protocols currently covered - Octopus streams the top 6 by TVL, but mappers could be built to support all of them with the community’s support.

Zooming out

Octopus is part of the Mobula Market Data API. Our goal is 100% harmonized coverage (CEX, DeFi) of cryptocurrencies - something that no one has achieved so far.

If you’re interested in trying it out, visit developer.mobula.fi.

On top of being more accurate, Mobula’s Market Data API is more efficient (and thus cheaper) than most existing API infrastructures.

If you’re an enterprise looking to switch from your current data solution, get in touch with the Team on Telegram or by email to get custom SLAs or self-hosted service.

Thanks for reading… 🤠


Mobula Labs is a small team, and we’re looking for talented engineers to join us - ping @NBMsacha on Telegram or mail@sacha.pw by email to get in touch!
