GovXS’s Evaluation Framework is designed to evaluate retro funding voting systems. It enables the analysis of voting designs and helps communities achieve their prioritized design goals by identifying optimal voting rules. The framework applies both axiomatic analysis and agent-based simulations to ensure comprehensive evaluation.
GovXS provides the following resources for evaluating retro funding voting designs:
GovXS Presents: Evaluating Voting Designs for Optimism Retro Funding (Workshop with Optimism Badgeholders)
GovXS Evaluating-Voting-Design-Tradeoffs-for-Retro-Funding (Open-source simulation framework to measure how different voting designs perform against several typical retro funding design goals)
A Social Choice Analysis of Retroactive Funding (Formal Description)
Retro Funding, as a key mechanism of the Optimism Collective, relies on voter assessments to distribute funding to projects that create positive impact for the ecosystem. With over 60 million OP tokens already distributed and 850M OP dedicated in total ($1.36B at the time of writing), a significant amount is at stake.
This article serves as Part 3 of GovXS's results report for Optimism Retro Funding, focusing on the risks posed by malicious behavior and the vulnerabilities within the system. We evaluate and compare four voting rules to assess how resilient these designs are against potential attacks. Our analysis asks critical questions: How resistant are these voting designs to malicious behavior, and what steps can be taken to improve the security and fairness of Retro Funding moving forward?
GovXS evaluated the following voting designs applied in Optimism Retro Funding Rounds 1-4.
Round 1 (R1 Quadratic Voting): For Round 1, Optimism implemented Quadratic Voting, allowing voters to allocate tokens with increasing costs to cast multiple votes.
Round 2 (R2 Mean): For the second round, the funding allocation was determined by averaging the votes across all participants, applying the Mean Rule.
Round 3 (R3 Quorum Median): In the third round, a minimum vote and token threshold were introduced, with the Normalized Median rule to reduce the influence of outlier votes. Only projects that met a minimum number of votes and funding were considered.
Round 4 (R4 Capped Median): For Round 4, Optimism introduced the Impact Metric Score, where voters prioritized KPIs rather than direct funding allocations. Projects submitted KPI data to validate their achievements, and funding was distributed based on voter preferences using the Normalized Median, with caps on allocations and redistribution of overflow funds. For better comparison, GovXS includes a version that skips calculating the Score based on Impact Metric Share.
The formal specification of all voting rules in the evaluation is available here.
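As a rough illustration of how these rules differ, the sketch below implements simplified versions of a mean rule, a normalized median rule with a quorum, and a cap with redistribution of overflow. The function names, the quorum test (a count of non-zero votes), and the proportional redistribution are assumptions made for this example; the formal specification remains the authoritative definition, and Quadratic Voting is left out here.

```python
from statistics import median

def mean_rule(votes, budget):
    """Fund each project in proportion to its average vote."""
    n, m = len(votes), len(votes[0])
    means = [sum(v[j] for v in votes) / n for j in range(m)]
    total = sum(means)
    return [budget * x / total if total else 0.0 for x in means]

def normalized_median_rule(votes, budget, quorum=0):
    """Per-project medians, zeroed below quorum, rescaled to the budget."""
    m = len(votes[0])
    medians = []
    for j in range(m):
        column = [v[j] for v in votes]
        supporters = sum(1 for x in column if x > 0)  # assumed quorum test
        medians.append(median(column) if supporters >= quorum else 0.0)
    total = sum(medians)
    return [budget * x / total if total else 0.0 for x in medians]

def cap_and_redistribute(allocation, cap):
    """Clip allocations at `cap` and hand the overflow to uncapped projects."""
    allocation = list(allocation)
    for _ in range(len(allocation)):
        overflow = sum(a - cap for a in allocation if a > cap)
        if overflow <= 0:
            break
        allocation = [min(a, cap) for a in allocation]
        free = [i for i, a in enumerate(allocation) if a < cap]
        base = sum(allocation[i] for i in free)
        if base == 0:  # nothing left to redistribute to (overflow is dropped)
            break
        for i in free:
            allocation[i] += overflow * allocation[i] / base
    return allocation

# Three voters, three projects, a budget of 100 units.
votes = [[0.5, 0.3, 0.2], [0.6, 0.2, 0.2], [0.1, 0.1, 0.8]]
r2 = mean_rule(votes, 100)
r3 = normalized_median_rule(votes, 100, quorum=2)
r4 = cap_and_redistribute(r3, cap=40)
```

In this toy profile the third voter's outlier vote of 0.8 lifts the third project under the mean rule, while the median rule ignores it; the cap then limits how much the leading project can receive.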
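To assess the "Resistance to Malicious Behavior" design goal, GovXS established five key metrics: Maximum Extractable Value, Voter Extractable Value (VEV), Robustness, Cost of Bribery, and Cost of Control.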
The evaluation yielded the following insights:
Quadratic Voting proves to be the most resistant to attacks tested.
Median-based rules (R3, R4) show higher sensitivity to malicious behavior, but introducing a cap on the maximum funding allocation per project (R4) significantly improves their resistance.
The Mean rule demonstrates strong resistance to control attacks; however, it is more vulnerable to vote manipulation, as seen in GovXS’s Robustness and Bribery assessments.
None of the voting rules we evaluated are strategyproof or group-strategyproof (design goal “Incentive Compatibility”), meaning that voters, alone or in collusion, can still improve their outcomes by misrepresenting their true preferences, as highlighted by the axiomatic analysis below.
Knowledge about other voters' preferences and voting behavior makes attacks more efficient. This highlights the importance of supplementary measures like secret voting and randomly assigning voters to projects, currently being tested in Rounds 5-6.
To assess the resistance of voting designs against malicious actors, one must consider the worst-case scenario: could a malicious actor extract 100% of the available funds? In the context of Retro Funding, GovXS defines a malicious actor (attacker) as a voter whose sole objective is to direct as much funding as possible to their preferred project, using their maximum voting power by allocating 100% of available voting tokens to this project, while disregarding the evaluation of all other projects and allocating zero tokens to them.
Based on the analysis of four different voting designs, we can determine if and under what circumstances an attacker could control the voting, meaning that the outcome is equal to the attacker's vote. We also show how much funding the attacker would be able to extract in this case.
For R1 Quadratic Voting and R2 Mean, it is theoretically possible for a malicious voter to extract 100% of the funds, but only if all voters cast their votes in the exact same way. This is an unlikely situation, and in such a case, we would not typically classify the voter as “malicious” because they represent the average preference of all participants.
R3 Median also allows for a maximum extractable amount of 100%, but here, it only requires 50% of the voters to vote like the attacker. This makes the Median rule more susceptible to manipulation since a coordinated effort among just half of the voters can ensure that one project receives all the funding. Additionally, the quorum requirement—which sets a minimum number of votes needed for a project to qualify for funding—opens another potential attack vector: if no other projects meet the quorum, all funds can be allocated to a single project due to the normalization process in R3 and R4.
R4 Capped Median presents the highest resistance, as the maximum allocation to a single project is capped at 5% of the round size (500K of 10M OP tokens in Round 4). This limitation significantly reduces the amount at risk due to a malicious actor.
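The thresholds above can be checked with a small worked example. The sketch below uses simplified mean and normalized median rules (both are assumptions for illustration, not the formal R2 and R3 definitions) with five voters and three projects, where three voters put all their tokens on project 0.

```python
from statistics import median

def mean_rule(votes):
    """Share of the round per project under a plain average of votes."""
    n, m = len(votes), len(votes[0])
    return [sum(v[j] for v in votes) / n for j in range(m)]

def normalized_median_rule(votes):
    """Per-project medians, rescaled so that the shares sum to 1."""
    m = len(votes[0])
    medians = [median(v[j] for v in votes) for j in range(m)]
    total = sum(medians)
    return [x / total if total else 0.0 for x in medians]

attacker = [1.0, 0.0, 0.0]   # everything on project 0
honest = [0.2, 0.4, 0.4]

# A majority (3 of 5 voters) votes exactly like the attacker.
votes = [attacker] * 3 + [honest] * 2

share_mean = mean_rule(votes)[0]                 # 0.68 of the round
share_median = normalized_median_rule(votes)[0]  # 1.0, the full round

# With a 5% cap on a 10M OP round, at most 500K OP can go to one project.
round_size = 10_000_000
cap = round_size * 5 // 100
at_risk_capped = min(share_median * round_size, cap)
```

Under the mean, the honest minority still pulls funding toward the other projects; under the median, the majority bloc fully determines the outcome, and only the cap limits the amount at risk.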
In all the evaluated voting rounds, malicious voters could potentially influence the outcome because all voters could vote on all projects on the ballot. In later rounds, such as Retro Funding Rounds 5–6, this risk is mitigated by randomly assigning voters to funding rounds or project categories, thereby limiting their ability to concentrate votes on a single project.
The following evaluation results are based on agent-based simulations. For all experiments, GovXS created artificial voting data. For vote generation, GovXS uses the Mallows model to generate a matrix of cumulative votes for 𝑛 voters and 𝑚 projects. Each voter’s total vote sums to 𝐾 (in this case 𝐾 = 1), and the votes are generated with randomness controlled by the parameter 𝛼. We begin by generating a base vote for all projects using the Dirichlet distribution. This ensures that the sum of votes for each voter is exactly 𝐾. Noise is then introduced into each voter’s vote by blending the base vote with an independent Dirichlet sample, with 𝛼 controlling the degree of noise. We use 𝛼 = 0.5, which represents a balance between homogeneity (all voters being similar in their vote distributions) and heterogeneity (each voter having individual differences). To check the results against real voting data, GovXS also used the publicly available voting data from Optimism Retro Funding Round 4. Here, the team extracts the vote values after calculating the Impact Score.
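The vote generation described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the GovXS implementation: the function names, the flat Dirichlet concentration of 1.0, and the convex blend `(1 - alpha) * base + alpha * noise` are assumptions about how the blending works.

```python
import random

def dirichlet(rng, m, concentration=1.0):
    """Sample a point on the simplex by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(m)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_votes(n, m, alpha=0.5, K=1.0, seed=0):
    """Return an n x m vote matrix in which every row sums to K."""
    rng = random.Random(seed)
    base = dirichlet(rng, m)           # shared base vote
    votes = []
    for _ in range(n):
        noise = dirichlet(rng, m)      # independent per-voter sample
        # Assumed blend: alpha = 0 gives identical voters, alpha = 1 independent ones.
        vote = [K * ((1 - alpha) * b + alpha * z) for b, z in zip(base, noise)]
        votes.append(vote)
    return votes

# 40 voters and 63 projects, matching the simulation setup in the figure captions.
votes = generate_votes(n=40, m=63, alpha=0.5)
```

Because both the base vote and the noise sample sum to 1, any convex blend of the two also sums to 1, so each ballot stays a valid distribution of 𝐾 tokens.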
The detailed specifications for our experimental design are available here.
Our first experiment explores Voter Extractable Value (VEV). Building on the worst-case scenario discussed previously, we use agent-based simulations to assess the maximum extractable funding for a malicious actor in a setting with voters with varying preferences. To determine the maximum VEV, we iterate through combinations of voters and projects, aiming to identify the vote-project constellation that maximizes value extraction.
This analysis also accounts for the behavior of a malicious voter, who typically attempts to disguise their actions. Specifically, the model assumes that a malicious voter would not allocate all tokens to a single project but would instead allocate 90-99% of their tokens to one project, with the remainder distributed equally across other projects.
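A minimal sketch of this search is shown below. It replaces one voter's ballot at a time with a disguised attack ballot and records the largest gain for the targeted project. The helper names, the use of a plain mean rule, and measuring VEV as the gain in the target project's share are assumptions for illustration; any rule with the same signature could be passed in.

```python
def mean_rule(votes):
    """Share of the round per project under a plain average of votes."""
    n, m = len(votes), len(votes[0])
    return [sum(v[j] for v in votes) / n for j in range(m)]

def attack_ballot(m, target, share=0.95):
    """Put `share` of the tokens on the target, the rest equally on the others."""
    rest = (1.0 - share) / (m - 1)
    return [share if j == target else rest for j in range(m)]

def max_vev(votes, rule, share=0.95):
    """Try every (voter, project) pair and keep the largest gain in share."""
    m = len(votes[0])
    baseline = rule(votes)
    best_gain, best_voter, best_project = 0.0, None, None
    for i in range(len(votes)):
        for j in range(m):
            manipulated = votes[:i] + [attack_ballot(m, j, share)] + votes[i + 1:]
            gain = rule(manipulated)[j] - baseline[j]
            if gain > best_gain:
                best_gain, best_voter, best_project = gain, i, j
    return best_gain, best_voter, best_project

votes = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.3, 0.3, 0.4]]
gain, voter, project = max_vev(votes, mean_rule)
```

In this toy profile the most profitable attack targets the project that the attacking voter originally supported least, since switching to the attack ballot then moves the most tokens.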
Our results indicate that R1 Quadratic Voting is the most resistant to VEV, with a mean extractable funding percentage of 1.50% and a maximum of 1.61%. The R2 Mean VEV converges more tightly, with a mean of 3.27% and a slightly higher maximum of 3.53%. Both median-based rules (R3 Quorum Median and R4 Capped Median) display a wider spread, showing means of 2.98% and 2.48%, respectively, and maximum extractable funding percentages of 6.33% and 4.74%.
When we consider the absolute funds at risk, based on the real votes in Retro Funding Round 4, a clear concern arises with voting rule R3 Quorum Median. In Round 4, it presents the highest risk. With total knowledge, a malicious actor could have extracted more than 1.47M OP tokens in Retro Funding Round 4 by allocating 90-99% of available tokens to their favorite project.
In the actual Round 4, however, the maximum cap limited the extractable value to 500,000 OP. Our analysis indicates that introducing the cap was an effective way to make the median rule more resistant to malicious actors.
Following Maximum Extractable Value and Voter Extractable Value, Robustness is the third metric in GovXS’s Evaluation Framework for assessing resistance to malicious actors. This metric examines how different voting rules respond to changes in individual voting behavior and how sensitive the rules are to slight manipulations in votes.
In this experiment, GovXS generated artificial voting profiles using the Mallows model, which simulates realistic voting behavior by introducing slight variations around a central preference (see “Experimental Design”). The team then measured the robustness of each voting rule using the L1 distance, which sums the absolute differences between the original and the manipulated funding allocations.
To assess robustness, we randomly selected a single voter’s profile and modified their vote by assigning it a new, randomly chosen floating-point value between 0 and 1. After altering the vote, we recalculated and compared the manipulated voting outcomes to the original allocations using the L1 distance.
The results indicate how much the voting outcome can be influenced by a vote manipulation, reflecting the robustness of each rule. A higher L1 distance suggests greater sensitivity, indicating that a single vote change can significantly impact the allocation outcome. Conversely, lower L1 distances imply that the voting rule is more robust and less affected by isolated manipulations.
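The robustness measurement can be sketched as follows. The sketch perturbs one randomly chosen entry of one randomly chosen ballot and reports the median L1 distance between the original and the manipulated allocation. Renormalizing the perturbed ballot so that it still sums to 1, the function names, and the simplified mean rule are assumptions made for this example.

```python
import random
from statistics import median

def l1_distance(a, b):
    """Sum of absolute differences between two allocations."""
    return sum(abs(x - y) for x, y in zip(a, b))

def mean_rule(votes, budget):
    """Fund each project in proportion to its total votes."""
    m = len(votes[0])
    sums = [sum(v[j] for v in votes) for j in range(m)]
    total = sum(sums)
    return [budget * s / total if total else 0.0 for s in sums]

def perturb_one_vote(votes, rng):
    """Overwrite one entry of one ballot with a random value in [0, 1)."""
    i, j = rng.randrange(len(votes)), rng.randrange(len(votes[0]))
    ballot = list(votes[i])
    ballot[j] = rng.random()
    total = sum(ballot)
    if total > 0:  # assumed: renormalize so the ballot still sums to 1
        ballot = [x / total for x in ballot]
    return votes[:i] + [ballot] + votes[i + 1:]

def robustness(votes, rule, budget, trials=100, seed=0):
    """Median L1 distance between original and manipulated allocations."""
    rng = random.Random(seed)
    base = rule(votes, budget)
    distances = [
        l1_distance(base, rule(perturb_one_vote(votes, rng), budget))
        for _ in range(trials)
    ]
    return median(distances)

votes = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.3, 0.3, 0.4]]
score = robustness(votes, mean_rule, budget=100.0, trials=50)
```

A lower `score` means the rule moves less funding in response to a single changed vote.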
Our simulation shows that R2 Mean is the most sensitive to vote changes. The median L1 distance for R2 Mean is approximately twice as high as that of R1 Quadratic (51,928.84 vs. 23,345.49), indicating a greater sensitivity to alterations in individual votes. In contrast, both R3 Median and R4 Capped Median demonstrate more resilience to changes, with median L1 distances significantly lower than R2 Mean, and closer to R1 Quadratic. This suggests that minor vote manipulations do not affect the outcomes as drastically under these rules.
The fourth metric, Cost of Bribery, quantifies the effort or expense required for an attacker to change the funding outcome for a particular project. To compare different voting rules, we simulate scenarios where randomly generated votes are adjusted to achieve a desired increase in funding. The objective is to determine the minimum number of additional votes needed for a target project to reach this funding goal. Our simulation assumes that any additional vote costs 1 cost unit. Through multiple simulation rounds, we calculate the average number of extra votes (or cost units) required, providing insights into how resilient each voting rule is against attempts to influence funding outcomes.
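One way to picture this measurement is the greedy sketch below, which adds one vote unit at a time to the target project until the desired funding increase is reached. The function names, the integer vote units, the simplified mean rule, and the greedy choice of which voter receives each extra unit are assumptions for illustration, not the search procedure used in the GovXS framework.

```python
def mean_rule(votes, budget):
    """Fund each project in proportion to its total votes."""
    m = len(votes[0])
    sums = [sum(v[j] for v in votes) for j in range(m)]
    total = sum(sums)
    return [budget * s / total if total else 0.0 for s in sums]

def bribery_cost(votes, rule, budget, target, increase, max_cost=1000):
    """Count the unit votes needed to raise the target's funding by `increase`."""
    goal = rule(votes, budget)[target] * (1 + increase)
    current = [list(v) for v in votes]
    for cost in range(1, max_cost + 1):
        best_voter, best_value = None, -1.0
        for i in range(len(current)):
            current[i][target] += 1          # try one extra unit from voter i
            value = rule(current, budget)[target]
            current[i][target] -= 1
            if value > best_value:
                best_voter, best_value = i, value
        current[best_voter][target] += 1     # keep the most effective unit
        if best_value >= goal:
            return cost                      # each extra vote costs 1 unit
    return None                              # goal not reachable within max_cost

votes = [[5, 3, 2], [4, 4, 2], [3, 3, 4]]
cost = bribery_cost(votes, mean_rule, budget=100, target=2, increase=0.10)
```

For this toy profile, the target project starts at about 26.7 of 100 units; one extra vote lifts it to about 29.0, which is still short of the 10% goal, and a second vote lifts it to 31.25, so the cost is 2.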
Our simulation reveals that R1 Quadratic Voting is the most resistant to bribery. Compared to other voting rules, an attacker would need a significantly higher number of vote changes to achieve the same funding increase. Notably, R4 Capped Median Voting also demonstrates strong resistance. This design applies a cap on the maximum funding per project, which makes bribing attacks much harder. For both R4 Capped Median and R1 Quadratic Voting, voters would have to allocate all available votes (in our simulation case, 8M OP) to achieve a funding increase beyond ~14% on average.
This analysis evaluates a subtle form of attack that is particularly difficult to detect. Typically, malicious behavior in voting is defined as biased or altered voting, and countermeasures often focus on identifying suspicious voting patterns. However, what if a malicious actor convinces other voters to simply abstain from voting?
The Cost of Control is a metric within the GovXS Voting Design Evaluation Framework that measures the impact of adding or removing voters from a voting process. In this analysis, we focus specifically on removing voters, as Optimism Retro Funding has implemented significant measures to make it difficult for attackers to "add" voters to a funding round.
By contrast, removing voters seems more feasible. Badgeholders who regularly participate in Retro Funding voting are not obligated to vote in every round. Additionally, voters can and may need to abstain from voting due to a self-reported conflict of interest.
To evaluate voting designs within Retro Funding, we explore how an attacker could influence funding outcomes by selectively removing voters—for example, persuading them to abstain. We measure the maximum funding increase an attacker could achieve by reducing the participation of certain voters. Again, the chart below is based on randomly created votes (see section “Experimental Design”).
Our comparison based on randomized voting data shows that R1 Quadratic and R2 Mean are the hardest to attack. The R3 and R4 median rules are clearly more vulnerable to attacks where voters are persuaded to abstain.
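The removal attack can be sketched as a greedy search: at each step, drop the voter whose absence raises the target project's share the most. The function names, the simplified normalized median rule, and the greedy strategy are assumptions for illustration; an exhaustive search over voter subsets could find larger gains.

```python
from statistics import median

def normalized_median_rule(votes):
    """Per-project medians, rescaled so that the shares sum to 1."""
    m = len(votes[0])
    medians = [median(v[j] for v in votes) for j in range(m)]
    total = sum(medians)
    return [x / total if total else 0.0 for x in medians]

def control_by_removal(votes, rule, target, max_removed):
    """Greedily remove up to `max_removed` voters to boost the target's share."""
    remaining = list(votes)
    removed = 0
    while removed < max_removed and len(remaining) > 1:
        current = rule(remaining)[target]
        candidates = [remaining[:i] + remaining[i + 1:] for i in range(len(remaining))]
        best = max(candidates, key=lambda c: rule(c)[target])
        if rule(best)[target] <= current:
            break                            # no single removal helps further
        remaining, removed = best, removed + 1
    return rule(remaining)[target], removed

votes = [
    [0.6, 0.2, 0.2],
    [0.5, 0.3, 0.2],
    [0.1, 0.5, 0.4],
    [0.1, 0.4, 0.5],
    [0.2, 0.4, 0.4],
]
before = normalized_median_rule(votes)[0]
after, removed = control_by_removal(votes, normalized_median_rule, target=0, max_removed=1)
```

In this toy profile, persuading a single low-scoring voter to abstain shifts the median for project 0 and raises its share from 0.20 to about 0.35.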
To put this comparison in perspective, we reran the simulation using the real votes from Retro Funding Round 4. The results indicate that only a small number of voters need to be removed to achieve a significant increase in funding. In the real R4 voting, an attacker with complete knowledge of the voting dynamics could have achieved a 30% increase in funding for their preferred project by convincing just 10 voters to abstain.
These findings confirm our initial observation:
Median-based rules (R3 and R4) are more vulnerable to control attacks. An attacker can strategically reduce the participation of select voters to skew the funding allocation.
The results suggest that Retro Funding would benefit from additional safeguards. Possible measures include aligning voter incentives, which we will discuss in our final Part 4 article, “Optimizing Retro Funding Towards an Objective Truth in Funding Allocation” (to be published in November 2024).
In the previous sections, we discussed how resistance to malicious behavior varies depending on the selected voting design. However, to fully understand the sensitivity of Retro Funding, it is essential to consider the specific setup of each round. This includes factors such as the expected number of participating projects, the number of eligible voters, and the total funding available in the Retro Funding round (the round size).
Our analysis of Voter Extractable Value (VEV) has revealed correlations, such as:
VEV is positively correlated with the round size.
VEV is negatively correlated with the number of voters and projects participating in the round.
To assess the resistance to malicious behavior in a particular Retro Funding round, it is crucial to run simulations using the actual voting parameters or, at the very least, the best estimates available. This approach ensures a comprehensive understanding of the funds at risk and the dynamics at play, round by round.
GovXS’s Evaluation Framework revealed that individual attacks on Retro Funding by a single voter are feasible and can be profitable. This risk increases when multiple voters collude, attacking the funding mechanism collectively. We assess a voting design's resistance to such coordinated attacks by examining its Group-Strategyproofness.
Group-Strategyproofness is defined as a property of a voting design where no group of voters can improve their collective outcome by misrepresenting their preferences, even if they act in coordination. Achieving this ensures that the voting mechanism remains fair, even under the threat of collusion.
However, none of the voting designs we evaluated are group-strategyproof while maintaining the essential properties needed for Retro Funding. This presents a complex problem that remains unsolved and requires further research to address effectively.
Given these findings, it is crucial to align the incentives of voters with those of the Optimism Collective. Continued development of Retro Funding is needed to uphold its core vision: "Impact = Profit." This alignment would mitigate the risks posed by collusion and strengthen the resilience of the funding mechanism.
GovXS’s Voting Design Evaluation Framework allows the analysis of the resilience of voting mechanisms to malicious behavior. By applying principles from Social Choice Theory, we provide a structured and rigorous analysis of voting rules' effectiveness against various forms of manipulation.
We evaluated several key metrics to determine the robustness of each design:
Maximum Extractable Value and Voter Extractable Value (VEV) - Measures the extent to which a malicious voter could extract funding.
Robustness - Examines how sensitive voting outcomes are to changes in individual votes.
Cost of Bribery - Evaluates the effort or expense required for an attacker to increase funding outcomes through vote manipulation.
Cost of Control - Investigates the impact of selectively removing voters from the process and the resulting increase in funding for the attacker's favorite project.
Additionally, we assessed Group-Strategyproofness and analyzed the designs' resistance to collusion, assessing whether coordinated manipulation could improve a group's collective outcome.
Our findings show that Quadratic Voting (R1) is the most resilient to the attacks we tested. Median-based rules (R3, R4), while offering some protection through quorum requirements and caps, remain vulnerable to control attacks. The Mean Rule (R2), though resistant to some attack types, displayed high sensitivity to slight vote changes, affecting its overall robustness. Finally, none of the voting rules assessed satisfied Group-Strategyproofness, meaning that a group of colluding voters can still improve its outcome by misrepresenting its true preferences.
These design flaws weaken Optimism Retro Funding’s resistance to malicious behavior and make it harder to find the objective truth in Retro Funding, where “Impact = Profit”. We explore this in Part 4 of our evaluation, “Optimizing Retro Funding Towards an Objective Truth in Funding Allocation” (to be published in November 2024).
Figure 1: Resistance to Malicious Behavior, Evaluation Results
Figure 2: Incentive Compatibility, Evaluation Results, see GovXS (2024) A Social Choice Analysis of Retroactive Funding
Figure 3: Maximum Extractable Value, scenarios
Figure 4: Voter Extractable Value / Voter type: Mallows Model / Simulation rounds: 30 / Round size: 8M OP / projects: 63 / voters: 40 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Figure 5: VEV in Retro Funding Round 4 (in OP tokens) / Voter type: Retro Funding round 4 voting matrix / Simulation rounds: 100 / Round size: 10M OP / projects: 230 / voters: 108 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Figure 6: Robustness / Voter type: Mallows Model / Simulation rounds: 100 / Round size: 8M OP / projects: 63 / voters: 40 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Figure 7: Cost of Bribery / Voter type: Mallows Model / Simulation rounds: 100 / Round size: 8M OP / projects: 63 / voters: 40 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Figure 8: Cost of Control / Voter type: Mallows Model / Simulation rounds: 100 / Round size: 8M OP / projects: 63 / voters: 40 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Figure 9: Cost of Control in Retro Funding Round 4 / Voter type: Retro Funding round 4 voting matrix / Simulation rounds: 40 / Round size: 10M OP / projects: 230 / voters: 108 / quorum R3: 17 / min. funding R3: 0 OP / min. funding R4: 1000 OP / max. funding R4: 500K OP
Kicked off with Optimism, the GovXS Voting Design Evaluation Framework is a tool to secure robustness, fairness, and trust in retro funding voting systems across all ecosystems. It covers design objectives like Resistance to Malicious Behavior, Incentive Compatibility, Simplicity for Voters, and more.
The open-source framework enables the prioritization of design goals and analyzes a voting design with formal rigor, applying axiomatic analysis and agent-based simulations.
GovXS is a research initiative under Token Engineering Academy. Team members include Nimrod Talmon, PhD, Angela Kreitenweis, Eyal Briman, and Muhammad Idrees. GovXS is a member of the Token Engineering Academy Applied Research Network. Sign up to receive further updates related to GovXS.