This article is jointly published by X-explore and WuBlockchain.
According to a recent article by WuBlockchain, trading volume on cryptocurrency exchanges has increased significantly since the start of 2023. Derivatives trading on the main exchanges rose 47.6% in January compared with the previous month. As that article indicates, this surge in volume is probably mixed with fake trades on the CEXs.
Wash trading on cryptocurrency exchanges is not a new topic. Because centralized exchanges are lightly regulated, some CEXs wash trade to obscure their true trading volumes and improve their volume rankings. In the sections that follow, we construct methods of trading volume analysis to identify and characterize fake trading volume.
We obtained historical K-line data, real-time depth data, and real-time transaction data of several mainstream exchanges through open APIs. These exchanges include Binance, Bybit, OKX, Bitget, Phemex, Kucoin, and Dydx.
We chose BTC futures contracts for our first analysis. Examining the BTC futures volume of each exchange after 1st September, 2022, we observed similar patterns across most exchanges.
Some similarities/differences in the volume across exchanges:
The general shape and trend of the volumes were broadly uniform, with similar spikes in volume over the same periods, except for Phemex.
The peaks and valleys in volume coincided across exchanges, except for Phemex.
Referring to the chart below, the shape of Phemex's trading volume was abnormal compared with other CEXs.
The picture below shows the price and trading volume of Phemex's uBTCUSD trading pair. Trading volume increased rapidly from 19 Aug 2022, then pulled back from this unusually high level between 16 Sep 2022 and 19 Nov 2022. After that, without any warning, the volume fell sharply, back to the level seen before 19 Aug 2022. From Jan 2023, the volume rebounded and has stayed at a high level.
It is common sense that price volatility results in spikes in trading volume, as investors close existing positions or open new ones. In our experience, periodic bursts of volume tend to follow large dips or rises in price, even more so in the futures/derivatives markets.
Because cryptocurrency is often traded with leverage, any sharp market movement can be amplified and trigger take-profit or stop-out orders, which is a major reason for volume to soar. We therefore examined how Phemex's trading volume changed when prices moved sharply.
Before 10 Aug, 2022.
We used 5-minute K-line data for the following.
There is a positive correlation between candle height and the lag-1 volume difference (the change in volume from the previous candle).
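The two quantities compared above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `Candle` type and both function names are hypothetical, and "candle height" is assumed to mean the high-low range of each 5-minute candle.

```python
from typing import List, NamedTuple, Optional


class Candle(NamedTuple):
    # Hypothetical container for one K-line; field names are assumptions.
    open: float
    high: float
    low: float
    close: float
    volume: float


def candle_heights(candles: List[Candle]) -> List[float]:
    """Height of each candle, assumed here to be the high-low range."""
    return [c.high - c.low for c in candles]


def lag1_volume_diff(candles: List[Candle]) -> List[Optional[float]]:
    """Change in volume versus the previous candle (lag 1).

    The first candle has no predecessor, so its entry is None.
    """
    volumes = [c.volume for c in candles]
    diffs: List[Optional[float]] = [None]
    for i in range(1, len(volumes)):
        diffs.append(volumes[i] - volumes[i - 1])
    return diffs
```

Plotting the two series on separate axes against time gives the kind of view described in this section.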
Looking into only 4th Dec, 2022 [we separate the axis for a better view.]
A huge market movement along with volume fluctuations
After 10 Aug, 2022.
Here, we look at the same 5-minute data as before.
There is a huge market movement around 19th Aug, 06:35.
❌ However, there isn't any volume difference.
At 19th Aug, 2022.
❌ Taking a closer look here, it seems that there's not much correlation.
💡 There was a sudden spike in market movement. However, there wasn't any volume difference.
There seems to be a huge change in volume. [Lag 1]
❌ However, there doesn't seem to have been much change in market prices.
At 26th Aug, 02:35, there was a sudden spike in Volume (as seen from the large lag 1 difference).
❌ However, there wasn't any price difference.
💡 With a large volume difference, there wasn't any large market movement.
Trading volume spikes broadly coincide with large market movements. [Before 10 Aug]
Prices and volume seem to be independent, as spikes in volume and market movements show weak or no correlation. [After 10 Aug]
To compare the price-volume relationship across exchanges more quantitatively, we calculated the (Pearson) correlation coefficient between candle length and volume for each exchange.
The per-day correlation was calculated and plotted on a graph against time.
Pearson's product-moment correlation coefficient was used here as it could represent the strength of the putative linear association between the variables.
A correlation coefficient of zero indicates that no linear relationship exists between two continuous variables, and a correlation coefficient of −1 or +1 indicates a perfect negative or positive linear relationship, respectively. A good gauge of correlation can be found in the table below.
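As a rough sketch of the per-day calculation, the snippet below computes Pearson's coefficient from its definition and groups 5-minute rows by day. The function names and the `(day, candle_height, volume)` row layout are assumptions made for illustration; the original analysis may have been organized differently.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when either series is constant
    return cov / math.sqrt(var_x * var_y)


def daily_correlation(rows: Iterable[Tuple[str, float, float]]) -> Dict[str, float]:
    """Correlation between candle height and volume, computed per day.

    rows: (day, candle_height, volume), one row per 5-minute candle (assumed layout).
    """
    heights = defaultdict(list)
    volumes = defaultdict(list)
    for day, height, volume in rows:
        heights[day].append(height)
        volumes[day].append(volume)
    return {day: pearson(heights[day], volumes[day]) for day in sorted(heights)}
```

A day on which volume rises and falls with candle height yields a value near +1, while a day on which volume is unrelated to price movement yields a value near 0.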
We mainly analyze BTC contracts, but also include an ETH trading pair.
We took the same K-line data from the various exchanges and compared the correlation between candle height and volume, with these parameters:
We'll look at the correlation from Sept 2021 onwards (where possible)
We'll look mainly at the 5 mins K-line intervals.
Taking Bybit and Binance as an example for now.
Looking at the correlation per day:
Taking a simple moving average with a window of 10 (SMA10).
SMA10 is increasing
BTCUSDT SMA10 holding above 0.8, with ETHUSDT holding above 0.7
Looking at the correlation per day:
Taking a simple moving average with a window of 10 (SMA10).
SMA10 is increasing
Both SMAs are mostly above the 0.8 threshold.
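The SMA10 smoothing applied to the daily correlation series can be sketched as follows. The function name `sma` is hypothetical, and leaving the first `window - 1` entries empty is one common convention, assumed here rather than taken from the original analysis.

```python
from typing import List, Optional


def sma(values: List[float], window: int = 10) -> List[Optional[float]]:
    """Simple moving average over a trailing window.

    Entries before the window is full are None.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    out: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out
```

Applied to the per-day correlation coefficients with `window=10`, this gives one smoothed value per day from the tenth day onwards.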
Adding other exchanges into the same graph, as well as ETHUSDT data.
Of the exchanges:
Dydx's correlation coefficients show periodicity before May 2022. In August 2021, Dydx started its trade-to-earn program, with a new epoch every 30 days and rewards distributed mainly according to each user's transaction fees. When a user's fee expenditure is less than the reward received, wash trading becomes profitable. In addition, the trading volume rose sharply before the end of each epoch (15 Feb, 14 Mar, and 12 Apr 2022). Once the DYDX token price fell rapidly to less than $2 in May 2022, transaction mining was no longer profitable, and Dydx's correlation coefficients gradually recovered.
Binance, Bitget, OKX, and Bybit have similar SMA10 correlation coefficients, which hold consistently above 0.75, indicating a strong linear relationship.
Kucoin's XBTUSDTM is a linear contract. Due to API limitations, we could only get data from December 2022 onwards. The SMA10 ranges from 0.56 to 0.71, indicating a weaker linear relationship.
Phemex's uBTCUSD SMA10 dropped significantly after May 2022, ranging from 0.11 to 0.43, which indicates a weak or absent linear relationship. The correlation coefficient returned to normal levels on November 20, stayed there until January 5, and then decreased again. This matches the changing trend of trading volume and is consistent with the information in WuBlockchain's article. No other exchange shows this pattern.
Combining the correlation analysis with the trend in trading volume, we can infer that Phemex exhibits fake volume, and that the fake volume is traded when the market price is stable. Abnormally inflated volumes often occur during such periods.
We also analyzed the recent trades/transactions of a few exchanges.
Large market movements tend to come with large transaction volumes. This is generally due to:
A sudden influx of trades.
Trades of larger size.
We took a look from 4th Nov, 08:00 to 23:59:
We focused on individual trade sizes. We plotted the individual trade sizes together with the K-line candle length for the corresponding times.
Phemex's raw trades.
Overlaying the K-line volume, we looked for any pattern between higher K-line volume and larger (single) trades.
Raw trade view
Overlaying the reported K line volume over the individual trade data.
Zoomed in view
K line trade volume grouped over the hour.
There was no obvious relationship between the raw trades and the K-line volume. The trade sizes did not look random.
Compared with Binance, Phemex's trade sizes were relatively static. What changed was the density of trades, not the individual trade sizes.
Depth is considered an important measure of exchange liquidity. Traders prefer markets with better depth because of lower transaction costs. We investigate the orderbook depth of the various exchanges to determine if there are any patterns or trends that might be observed.
The orderbook depth data was recorded every 10 seconds.
Taking the bid and ask at the first level of the book, we derived the spread at each point in time and plotted it against time.
Raw Spread of the exchanges.
Spread, averaged by minute.
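A minimal sketch of the spread calculation and the per-minute averaging follows. The snapshot layout `(unix_seconds, best_bid, best_ask)` and the function names are assumptions; the source states only that depth was recorded every 10 seconds.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def spread(best_bid: float, best_ask: float) -> float:
    """Top-of-book spread: best ask minus best bid."""
    return best_ask - best_bid


def mean_spread_per_minute(
    snapshots: Iterable[Tuple[int, float, float]]
) -> Dict[int, float]:
    """Average the spread over each calendar minute.

    snapshots: (unix_seconds, best_bid, best_ask), an assumed layout.
    Keys of the result are the unix timestamps at which each minute starts.
    """
    buckets = defaultdict(list)
    for ts, bid, ask in snapshots:
        minute_start = int(ts) // 60 * 60
        buckets[minute_start].append(spread(bid, ask))
    return {m: sum(v) / len(v) for m, v in sorted(buckets.items())}
```

With 10-second snapshots, each minute bucket averages about six observations, which smooths out single-snapshot noise in the raw spread.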
It is questionable that such high volume would be traded on Phemex, given that its spread is not the tightest.
Next, we look at the quantity of coins available within a specific amount of slippage. The larger that quantity, the better the liquidity of the exchange.
Here, we set the slippage amount to be 10 USD/T.
(We calculated the number of coins [BTC] that is in the book before a 10 USD/T Slippage from the mid point)
Looking into the number of coins we'll get before a slip of 10 USD/T occurs.
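The depth measure described above can be sketched as the total quantity resting within a fixed price distance of the mid point. The function name and the `(price, quantity)` book layout are assumptions, and quantities are assumed to be expressed in coins already.

```python
from typing import List, Tuple

Level = Tuple[float, float]  # (price, quantity in coins), an assumed layout


def depth_within_slippage(
    bids: List[Level], asks: List[Level], max_slip: float = 10.0
) -> Tuple[float, float]:
    """Coins available on each side within max_slip of the mid price.

    Returns (bid_quantity, ask_quantity).
    """
    best_bid = max(price for price, _ in bids)
    best_ask = min(price for price, _ in asks)
    mid = (best_bid + best_ask) / 2
    bid_qty = sum(qty for price, qty in bids if price >= mid - max_slip)
    ask_qty = sum(qty for price, qty in asks if price <= mid + max_slip)
    return bid_qty, ask_qty
```

Evaluating this on every snapshot and plotting the result over time shows how stable each exchange's near-mid liquidity is.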
Turning to Phemex's uBTCUSD volume
It is observed that:
The book's depth was inconsistent, ranging from 0.1 to 50 coins.
The same was observed for the asks.
We also tried another way. We fixed the amount spent and determined the percentage slip that would result from that.
For this experiment, we first set the amount spent at 100K/150K/200K USD/T.
We calculated the amount of coins bought, and the average price of each coin.
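The fixed-spend experiment can be approximated by walking the ask side of a book snapshot until the budget is used up. This is a sketch under assumptions: the function name, the `(price, quantity)` layout, and the choice to raise an error when the book is too shallow are all illustrative.

```python
from typing import List, Tuple


def simulate_market_buy(
    asks: List[Tuple[float, float]], mid_price: float, spend: float
) -> Tuple[float, float, float]:
    """Walk the ask side with a fixed quote-currency budget.

    asks: (price, quantity in coins) levels, an assumed layout.
    Returns (coins_bought, average_price, slip_percent_from_mid).
    """
    remaining = spend
    coins = 0.0
    for price, qty in sorted(asks):  # cheapest level first
        if remaining <= 0:
            break
        level_cost = price * qty
        if level_cost <= remaining:
            coins += qty  # consume the whole level
            remaining -= level_cost
        else:
            coins += remaining / price  # partial fill at this level
            remaining = 0.0
    if remaining > 1e-9:
        raise ValueError("order book too shallow for the requested spend")
    average_price = spend / coins
    slip_percent = (average_price - mid_price) / mid_price * 100
    return coins, average_price, slip_percent
```

Running this with spends of 100K, 150K, and 200K on each snapshot yields the coin volume and slip values that the plots compare across exchanges.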
Here, we plotted the coin volume vs slip (From mid-price). We need to pay attention to the number on the x-axis. The change in the value of the y-axis was due to the fluctuation of the market price.
To conclude, Binance and Dydx seem to have very good book depth and prices. Kucoin and OKX follow behind, and Phemex has some of the largest slippage.
After the comparative analysis of trading volume and price changes, and the comparison of order book depth, it is reasonable to suspect Phemex of reporting fake trading volume.
In this part, we will introduce a method for estimating the real proportion of reported trading volume.
We built a model with the following steps:
Observing that there was a weak correlation between Phemex's price fluctuations and trading volume, we assumed that fake transactions on the exchange mainly occur during periods of small price fluctuations. That is to say, the trading volume during significant price fluctuations is real. This is a conservative assumption for estimating the proportion of false trading volume.
For exchanges with normal trading volume (Binance, OKX, etc.), divide the trading volume during rapid market fluctuations by the trading volume of the whole day to get the ratio R.
Divide the abnormal exchange's trading volume during rapid market fluctuations by the ratio R to estimate its real trading volume.
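The steps above can be sketched as a small calculation. The function names are hypothetical, and averaging the reference ratios with a simple mean is an assumption about how R is aggregated across the normal exchanges.

```python
from typing import List, Tuple


def reference_ratio(reference: List[Tuple[float, float]]) -> float:
    """Ratio R: share of daily volume traded during the spike interval.

    reference: (spike_volume, daily_volume) pairs from exchanges assumed
    to report normal volume; R is their simple mean (an assumption).
    """
    ratios = [spike / daily for spike, daily in reference]
    return sum(ratios) / len(ratios)


def estimate_real_volume(suspect_spike_volume: float, r: float) -> float:
    """Estimated real daily volume: the spike volume, assumed real, divided by R."""
    return suspect_spike_volume / r


def estimated_share(suspect_spike_volume: float, r: float, reported_daily: float) -> float:
    """Estimated real volume as a fraction of the reported daily volume."""
    return estimate_real_volume(suspect_spike_volume, r) / reported_daily
```

For example, if the spike interval carries 25% of daily volume on the reference exchanges and the suspect exchange traded 50 units in that interval, the estimated real daily volume is 200 units, whatever daily figure the exchange reports.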
For the selected days, we calculated each spike interval's volume as a percentage of the corresponding daily volume of that exchange.
On the scatter plot, the value at each datetime (one 5-minute interval) is usually lower than 10% of the daily volume, with a few exceptions.
Most points on the scatter plot show that the percentages vary between 1% and 8%.
Zoomed in plot
Phemex's Spike volume ratio tends to be significantly lower, with a gap separating Phemex's plotted points from the other exchanges.
Phemex's spike ratio for the datetimes seems to vary between 0.02% and 0.09%, with some occasional points above 0.5%.
It is observed that OKX seems to have the highest percentage.
Phemex's plotted points seem to be below the 1% mark for most days.
Using the mean of the ratio per datetime and Phemex's spike k-lines' trading volume, we look to estimate the real volume of Phemex.
Taking the estimated volume per day, we compared it against the reported volume data and showed the results in a line plot below.
There were some obvious disparities, and the estimated volume lies entirely below the reported volume.
The daily ratio of estimated to reported volume ranged from about 3% to 92.8%.
Taking a simple ratio over the sum of volumes on all dates, the estimated volume ranged from a minimum of 5% to a maximum of 25% of the reported volume.
When we used the same method to estimate the true volume of Binance and Bybit, the results were very close to their reported volumes.
We used public data from several of the most popular centralized exchanges.
Phemex's trading volume followed a different pattern from other exchanges, and the sudden increase in volume on its platform is abnormal.
The correlation between trading volume and K-line height is significantly lower for Phemex than for any other exchange. This indicates that Phemex's trading volume was not solely dependent on market movement, and there might be other forces driving its volume.
Phemex seems to have higher spread and slippage than most exchanges, which should have deterred some clients from trading.
A ratio analysis based on spike volumes put the estimated volume at between 5% and 25% of Phemex's reported volume.
In the individual trade data, Phemex's trade sizes did not change much even when the market moved or when the reported K-line volume was high. On Binance, in contrast, (single) large trades accompanied the market movements that produced huge K-line volume.