Our team at Migalabs decided to participate in The Merge Data Challenge organized by the Ethereum Foundation. During that project, we created a website to visualize and compare validators’ rewards one month before and after the merge. We have decided to share all the data that we gathered with the ecosystem so that other researchers and developers can look at the numbers and use them in their own analyzes.
“The data can be accessed at our big query project:
using* SQL format. Please make sure that you have configured a billing account with Big Query. See here how to configure your billing account.”*
Here is the explanation of the data (and its structure) we collected for the experiment. Note that all the below tables were obtained from two source tables: the first contained one row per validator and epoch (so, around 500K rows per epoch) and the second table is the link of each validator to a single staking pool (when possible). As for resource issues, we have decided to expose the set of tables that were used to build both the website and the blog post of the experiment. These are the result of performing aggregations of the source tables and are a lot less resource-consuming.
This table shows statistics about each measured epoch, including the number of active validators, attesting validators, total balance, total attesting balance, total effective balance, sum of missing source, target and head and missed blocks in the epoch.
This table contains, for each measured slot (post-merge), the proposer index and the earned rewards at the execution layer. Note that this table contains half of the information, as during the 7K epochs collected before The Merge we could not link the execution layer rewards to a validator.
This table contains, for each validator a set of statistics summarizing both: pre and post-merge measured epochs. Therefore, for each validator, we will find two rows in this table, one for pre-merge and one for post-merge. Among the statistics, we will find number_lines (number of epochs collected) and three columns, summarizing the number of missed source, target and head by the given validator during the given period.
This table contains, for each validator a set of statistics summarizing both: pre and post-merge measured epochs. Therefore, for each validator, we will find two rows in this table, one for pre-merge and one for post-merge. Among the statistics, we will find the total amount of rewards obtained (consensus layer) and the max rewards that could have been obtained by the given validator during the given period.
This table contains one row per staking pool, epoch, and status. At each row, we will find how many validators belonged to the given pool at the given epoch with the corresponding status. There are 4 different statuses: “in the queue to activation”, “active”, “slashed” and “exit”.
This table contains one row per pool and epoch. At each row, one will find the average reward obtained by the validators in the pool, the average max reward that they should have obtained, the count of missing attestations (source, target, head), and the number of active validators.
This table contains one row per validator and period (pre/post-merge). At each row, one will find the average reward that the validator obtained and the average max reward, which is the reward the validator could have obtained.
This table contains one row for each validator and the slot at which it had the duty to propose a block. In this table one will find whether a validator had the duty to propose a block in a given slot and if it actually proposed the block or not.
This table contains a summary of all the above tables. One will find one row per pool and period (either pre or post-merge) with statistics as a result of all the above tables. This table was used to create most of the figures and analytics for the study. However, please continue reading below in case you want to dig deeper.
Here you will find several examples of metrics that one can obtain from the data presented above.
Let’s say one wants to know how many times a validator had the duty to propose a block. Then, this information can be found in the proposer_duties table, filtering by the field f_val_idx and the epochs we want to include in the query.
SELECT f_val_idx, count(distinct(f_proposer_slot)) as number_proposer_duties FROM `merge-data-challenge.merge_metrics.proposer_duties` WHERE f_val_idx = 131072 GROUP BY f_val_idx
Suppose someone wants to know what is the reward of a specific pool in a period. In that case, one can query the table Mview Pool Rewards, group by the epochs to include in the query, and apply the corresponding aggregate function (either average or sum, for example).
SELECT f_pool_name, avg(f_avg_reward) as average_reward FROM `merge-data-challenge.merge_metrics.mview_pools_rewards` WHERE f_pool_name = 'coinbase' and f_epoch > 146875 GROUP BY f_pool_name
If someone wants to know the number validators per status, one can query the table Mview Metrics Val Status, where the status can be filtered and, at each epoch period (either pre or post-merge), one could find the number of slashed validators for example. One can also query per specific validator.
SELECT f_merge, f_status, count(f_status) as count_status FROM `merge-data-challenge.merge_metrics.mview_metrics_val_status` GROUPBY f_merge, f_status