‘How do we know it works?’

Old and New Problems with Measuring Impact (And how Web3 might solve them)

Did ‘the good’ we tried to do, do any good? On the back of much criticism, this question drove not-for-profits, governments and institutions for much of the last half-century. Now, Crypto enters the scene, and along with regen, brings an explosion of DAO activity to the idea of funding public goods. So how can Web3 learn from the mistakes of the past, and measure impact effectively? Let’s find out.

Charity, Aid, ‘doing good’ or now, ‘Impact’. Whatever we call organised altruism in human societies, it’s been core to our idea of community, for thousands of years.

It’s scaled to become industrial. Global philanthropy is valued at $2.1 trillion per annum (according to Citibank). Yet Aid is beset by problems: misuse of funds, accountability failures, and top-down decision-making, to name but a few. What we call Impact seems ripe for disruption.

So what is it? Impact usually arrives where need meets altruism, in the form of NGOs and Community Based Organisations, who fill gaps where government provision falls short, or perhaps where human catastrophes or conflicts have happened.

Typical Impact administration models include government funded, private or philanthropic, and charitable foundations. Private sector giving, corporate social responsibility (CSR; which is funding or business activity, socially administered, yet aligned with business objectives) has become more common over recent years, whilst local groups are increasingly seen as the most sustainable delivery method, and tend to be the most responsive, due to their deeper levels of knowledge at community level.

Over the last 40 years Impact has changed dramatically. From the Bretton Woods System, to massive international charities, the Millennium Development Goals, CSR, SDGs, and now ESG. Money has shifted from cause to campaign, often aided by political trends: saviourism (inter-war years to 1950s), reconstruction and anti-communism (1945 to 1980s), famine in Africa (1980s), HIV/AIDS (1990s to early 2000s), to climate change prevention and adaptation (now).

Blockchain technology arrives as an obvious candidate to democratise the architecture that underlies global Impact. Powered by potential for hyper-local, transparent and frictionless governance, smart contracts offer the possibility of programmable funding for projects. Also, blockchains create means by which we can fairly price undervalued public goods, meaning communities can be supported to do more good things for the planet, and companies can be incentivised to stop doing the opposite.

Yet any excitement around improvements to philanthropy’s substrate should not obscure Impact’s complexity. Improved tooling won’t guarantee improved outcomes, and some Web3 projects know this, tapping into older expertise and seemingly unfashionable industry knowledge.

We, as former and current Impact industry professionals (including one PhD in that field) and regenerative crypto-economics enthusiasts, were also drawn to these questions. We wondered: what’s under-represented here? One key challenge we noticed within the existing Impact ecosystem, and one that blockchains will inherit, is ‘measurement’: how do we know what we’re doing is impactful?

We wanted to make a humble contribution to existing efforts - such as those at Gitcoin - to answer this question, by providing a brief synopsis of some of the continuing challenges that impact measurement faces in the Aid sector, and suggesting avenues for Web3 to build out from and create coordination mechanisms for. We hope any condensed value-add from our thoughts here can serve as residual, or institutional, memory, so that smarter Web3 folks building governance and funding systems communally in regen may be better equipped, with these problems in mind.

In the first part we set the scene, look at DAOs, and some examples of their contributions to impact. We then introduce some of the historical failures of industrial impact, taking three levels of analysis (meta, macro and micro), examining one limited example from each (which we acknowledge is insufficient). Dividing into strata like this is problematic in Web3, as the ideal type of regenerative crypto-economics assumes the dissolution of these levels. Nonetheless, we include them so that the problems we identify in each level can be itemised contemporaneously. And in an attempt to move beyond taxonomies, at the end of each level, we suggest avenues for further solution-discovery for builders and community members in Web3 to work beyond types to forms of regen DeSoc, which distribute power more equitably in governance. Fortunately, as we can see in various thought-trajectories, there are already solutions that are being worked on that could be readily applied to these issues.

Part 1: How is Web3 approaching Impact?

There are movements, such as regen, which coalesce various societal communities within and beyond blockchains (such as solarpunk) towards planetary regeneration and the feeding of public goods. There are also numerous DeFi and NFT projects that hive a portion of their profit off for good causes.

However, in terms of function, when we think about Web3 and impact, we think about three tendencies that development seems to be leaning towards. These are: coordination, funding, and crucially, proof.

1. Better Coordination: Impact DAOs

Most people will know Web3 as cryptocurrencies, NFTs, and perhaps DeFi. But Web3’s first wave of democratisation towards governance came in the form of DAOs. A DAO is an organisation without central authority. Instead, the organisation is governed from the bottom up by a decentralised body of users who make decisions on proposals. In this model, proposing and deciding ideally happen on-chain, or at a minimum publicly. The number of DAOs is exploding, a phenomenon we explored earlier this year.

DAOs that focus on activities which were once the domain of Aid and philanthropy have been called ‘Impact DAOs’; the accepted definition being ‘a DAO that creates net positive externalities to the ecosystem around it’. Impact DAOs offer a Schelling point for those interested in a specific cause to pool efforts and resources in order to power a solution. Gold standard examples of Impact DAOs are: Proof of Humanity, Celo and Coordinape. Gitcoin has published a book on Impact DAOs; a great resource for further reading.

2. Better Funding: Quadratic Funding

Web3 technology provides an improved substrate for novel yet effective decision mechanics to have their moment. Quadratic Funding, based upon quadratic voting, is a method of matching-fund allocation that, when properly discounted, rewards projects with the highest number of contributors, rather than those with the largest sums contributed. The effect of this is greater plurality, and wider distribution of power, away from those with the deepest pockets. The current gold standard of QF in action is Gitcoin Grants.
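To make the mechanic concrete, here is a minimal sketch of the basic Quadratic Funding formula, in which a project's share of the matching pool is proportional to the square of the sum of the square roots of its contributions, minus the contributions themselves. This is an illustration under our own assumptions, not Gitcoin's actual implementation: the function name `quadratic_match`, the project names and the amounts are all hypothetical, and the discounting mentioned above is deliberately left out.

```python
from math import sqrt


def quadratic_match(contributions, matching_pool):
    """Split a matching pool between projects using basic Quadratic Funding.

    contributions: dict mapping a project name to a list of individual
    contribution amounts (hypothetical data, one entry per contributor).
    matching_pool: total matching funds to distribute.
    No discounting or fraud protection is applied in this sketch.
    """
    raw = {}
    for project, amounts in contributions.items():
        total = sum(amounts)
        # (sum of square roots)^2 grows with the NUMBER of contributors,
        # so broad support counts for more than one large donation.
        raw[project] = sum(sqrt(a) for a in amounts) ** 2 - total

    raw_total = sum(raw.values())
    if raw_total == 0:
        return {project: 0.0 for project in raw}

    # Scale the raw matches so they add up to the available pool.
    return {p: matching_pool * r / raw_total for p, r in raw.items()}


# Both projects raise 100 in total, but from very different crowds.
example = {
    "many_small_donors": [1.0] * 100,  # 100 people give 1 each
    "one_large_donor": [100.0],        # 1 person gives 100
}
matches = quadratic_match(example, matching_pool=1000.0)
print(matches)
```

In this toy case the whole pool flows to the project with 100 small contributors, because a single contribution, however large, generates no match under the basic formula. Real rounds add discounting and identity checks on top, since the raw formula can be gamed by splitting one donation across many wallets.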

3. Better Proof: Impact Certificates

‘Doing good’ confers social status; it’s a major driver behind philanthropy, and sees its latest iteration in the now widely used concept of ‘purpose’. Yet what has been missing is proof that you have contributed to ‘doing good’. Web3 could allow for collections of ‘Impact Certificates’, or proofs that someone contributed (either labor or finances) to Impact. These certificates would be owned by you, (some) non-transferable, and displayed in wallets to denote history, or the right to participate in other Impact activities. Some of these certificates could be NFTs, which creates the possibility of certificates being traded, leading to Impact Markets. A thriving marketplace for impact certificates (provable impact) would provide an improved incentive mechanism to fund more people to ‘do good’. Impact Certificates are in their infancy. However, projects like Funding the Commons are stress-testing them.

Part 2: Old Problems and New Solutions?

We’ve briefly introduced the history of Impact above, and looked at some of the tendencies Web3 seems to be building towards. The last - proof - is especially curious, because in its current iteration in Web3 projects, proof denotes whether a contributor has funded impact, but not whether the said impact was demonstrably and verifiably impactful. With this in mind, we sketch out three levels (meta, macro, micro) at which Industrial Aid has met challenges, and suggest one issue area within each level that we encourage Web3 developers and community builders to keep in mind when creating.

It’s important to note that the Impact Industry has soul-searched, analyzed its failings, and subjected itself to remarkable levels of self-criticism. At times, this has felt far more reflective than we would expect to see in public and private sectors. They have done a huge amount in a quest to be better, and continue to do so, to evolve. This has led to a number of measurement and process innovations within the last few decades, including:

  • Emphasizing “hyper-local” funding solutions which flow direct to grassroots organisations.
  • INGOs setting up local offices in the Global South, and being less top-down.
  • Frameworks that capture complex social change processes, like Theories of Change.
  • Outcome-led funding allocation (funding released upon achievement of Impact Outcomes).
  • Incorporating scientific methods - such as the Randomised Control Trial (RCT) - to understand cause and effect.
  • Logical Frameworks (Logframes), to clearly link, and separate, programme inputs, outputs, and outcomes.
  • Increased datafication and validation, through investments in Monitoring & Evaluation, and using data gathered to tell stories about interventions and how they worked.
  • Open-source toolkits for CBOs and NGOs to get started with data and analysis.
Upshot, 2021

Yet despite this, we still inhabit a world in which the industry is set against the kind of derogatory statements that high-profile figures like Elon Musk, academics, and other organisations regularly make. The point is not that any of these critiques are correct; it is that they can persist, and that the performance of the Impact Industry has not been enough to marginalise these voices.

So now that we have understood something of the context for Impact, let’s zoom into some of these problem areas. Let’s ask: why is it so hard to measure impact?

PROBLEM 1 - META: POWER ASYMMETRIES

At the Meta level, we are trying to understand problems as they relate to themselves. Another way of saying this is that we are looking at the fundamental level of the challenge, beyond superficial characteristics, to the basic rules of the game.

Historically, this has meant that ‘meta’ might refer to the actual way power works in a system, even if most of the time that power is only obliquely visible. Impact measurement in the Aid industry has been mired in issues of power distribution since the beginning of formal decolonisation in the second half of the twentieth century, and its replacement with a neocolonial system of debt, aid, and diktats for neoliberal reform.

Back in 2017, one of us wrote a book chapter and a thesis on one way in which power asymmetries manifest themselves in ‘development’ interventions (in that research, the kind of interventions that use sport). Although the work was specific to one type of ‘for good’ work by non-profits, the research drew from a 50+ year canon of Development Studies, Political Economy, and Sociology.

That corpus teaches us that attempts to ‘develop’ communities have power asymmetries baked in from the very beginning: aider/aidee, giver/receiver, funder/fundee, benefactor/beneficiary, knowledgeable/ignorant, advanced/backward, and so on. However, within preventative community-based programmes that worked on ‘behaviour change’ (e.g. addiction, reproductive health, employment, etc.) there were more specific, functionally weak power dynamics. These were based on biases in measurement, and definitional fluidity. Through a governmentality lens, these projects could be seen to rely on tenuous conceptual frameworks that sought to sustain these power asymmetries via rendering them ‘knowable’. They achieved this, tangentially, by artificially condensing a huge variety of human experience into microscopic instances where the non-profit spent time with the individuals they’re meant to serve. Another way of saying this is: non-profits sought to prove their impact by compacting social life, and all the complex factors that lead to a change in a community, and attributing those changes almost exclusively to their own work.

Whether that attribution was fair or not, was not the question. None of this suggests any malpractice at all. It’s just that to pull off this feat of justifying the funding that non-profits were receiving (itself a power asymmetry) the organisations had to condense a wide spectrum of possible causation into a few valuable moments that they shared with the beneficiaries, and codify this in impact measurement.

A myriad of software platforms sprang up around them in support. This helped these organisations turn all these loosely connected experiences into causation, which then got entered into databases, which then demonstrated impact, and sustained funding. Larger charities with greater economies of scale developed advantages in a competitive funding landscape throughout the early 2000s. Some organisations, especially smaller ones, became locked into a system of verification that they had no part in designing. As one participant from South Africa, heard at a European Development conference, put it: “measure me on my own terms not the World Bank's”.

None of this discounts the fact that there are great organisations - in most cases - trying to grapple with difficult problems and creating a positive impact for the communities they serve. Yet unquestionably, they were instrumental in sustaining an unfair distribution of power, and the Aid industry has found this very difficult to reconcile, despite its best efforts.

Web3 vs Impact Measurement at the Meta Level:

Crypto has huge potential to democratise stakeholders and convert them into equity partners within a project. In so doing, it can create new, potentially fairer power dynamics. But Impact DAOs formed with the principles of regenerative cryptoeconomics have additional advantages when it comes to how power is distributed. First of all, we can (through POAPs, SBTs, or other mechanisms to ensure participation is ‘bottom-up’) democratise governance of impact funding. This answers a crucial question that Impact measurement has found difficult to solve and take to scale: local ownership. Wedded to this ownership is self-sovereign control over one's identity (imagine, for example, refugees who own their digital identity stored on blockchains rather than on papers, where it’s up to them when, how and what to doxx). With this in mind, we can imagine a way in which a participant in an impact programme is at least equal with a contributor (or funder) by means of the protection of their digital identity with a wallet address on-chain. This also opens the door to markets in which elements of participants’ data can be shared without revealing anyone’s identity, for the purposes of monitoring and evaluation of programmes.

An additional measure to ensure diverse ownership is Quadratic Funding (QF). Discounted to ensure plurality, in the manner that Ohlhaver, Weyl and Buterin suggest might work in Decentralised Society (DeSoc), it is a means by which we can ensure the largest actors no longer hold the greatest sway in decision making.

Avenues for further discovery or widened application at the meta level:

  • Plurality
  • Bottom Up approach
  • Self Sovereign identity
  • QF
  • SBTs/POAPs

PROBLEM 2 - MACRO: UNIT OF ANALYSIS

It’s one thing to ask, ‘what worked?’ It’s quite another to ask the question ‘what do we measure, to understand if something worked or not?’ This is the Unit of Analysis problem; what to measure; what to pay for?

Let’s imagine we’re trying to improve access to water in a community in Arizona. We have the building contractor working to bore the well. We have the government sanitation department, which is tasked with regulating it, and the community, who are likely to both use it and maintain it. What do we measure, and thus pay for, if we want to know we are improving access to water? The awarding of the well contract, the boring of the hole, the completion of the project, or the community adopting it? What if we decide on any one of these, but no one actually ends up using the well (this has happened elsewhere, many times)?

Historically, what to measure (and where to lay fault when things don’t work out) has been a thorn in the side of attempts to demonstrate project efficacy. As such, some focus has shifted to measuring outcomes and sustainability over the long term, sometimes ten years after project inception, when community usage of the resource is shown to be viable long term.

This is a point made by the Brookings Institution, which says:

“Every aid agency faces this dilemma. They might do an evaluation at the project level to provide information on whether they are doing things right, but they have a great deal of difficulty knowing whether they are doing the right things to achieve the objectives they have set out for themselves. Absent that, aid agencies cannot learn and improve and for a long time most aid failures and successes were attributed to recipient countries, as if aid agency performance was a minor factor.”

For too long, when the unit of analysis couldn’t be accurately pinpointed, local factors such as corruption were blamed, and not the project design. Deciding what to measure (and therefore what to design), in the quest to solve that problem, became a real challenge.

Web3 vs Impact Measurement at the Macro Level:

The Unit of Analysis problem is a very real and likely inheritance which the funding of public goods via Web3 will need to solve for. It’s a problem you can see well understood by contributors in the first 19 minutes here. It’s also potentially immediate, in that it is a problem in the real-time cycle for impact, and yet developers and contributors should know that the solution from industrial Aid so far has been longer term evaluation cycles. Crypto effectively speeds up funding to projects, and investors could look for immediate proof of impact. If this is combined with a quest for scale, then we can see how this would create problems for Impact DAOs and NGOs that are simply looking to prove that what they are doing really works long term (not necessarily for funding, but for the communities they serve). Long term vesting of incentives is one possible option to explore and iterate on.

Retroactive funding (discussed here) of Impact provides a reliable incentive mechanism that, by design, encourages projects to concentrate on bringing long term value. For-profit organisations raise capital due to their promise of an exit. Retroactive funding creates the same possibility of a financial ‘exit’ for Impact organisations by paying for Impact already achieved. Web3 provides an improved substrate for this to operate on due to its programmable payments, with oracles speeding up and qualifying whether impact has happened, and if it’s time for an organisation to be paid. These solutions are in their infancy, with experimentation taking place.
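The logic of paying only for impact already achieved can be sketched in a few lines. What follows is a plain Python model rather than a smart contract, written under our own assumptions: the class `RetroactiveEscrow`, the attestation threshold and the oracle names are all hypothetical, and a real system would also need to decide who the oracles are and what exactly they attest to (which is the Unit of Analysis problem again).

```python
from dataclasses import dataclass, field


@dataclass
class RetroactiveEscrow:
    """Hypothetical escrow that pays out only after impact is attested."""

    deposit: float                 # funds locked by the funder up front
    required_attestations: int     # distinct oracles needed to release
    attestations: set = field(default_factory=set)
    paid_out: bool = False

    def attest(self, oracle_id: str) -> None:
        # Each oracle counts once, however many times it reports.
        self.attestations.add(oracle_id)

    def release(self) -> float:
        """Return the amount paid to the project on this call."""
        if self.paid_out:
            return 0.0  # funds can only be released once
        if len(self.attestations) < self.required_attestations:
            return 0.0  # impact not yet sufficiently verified
        self.paid_out = True
        return self.deposit


# Usage: funds stay locked until two distinct oracles confirm the outcome.
escrow = RetroactiveEscrow(deposit=500.0, required_attestations=2)
escrow.attest("community_council")
first_try = escrow.release()       # nothing is paid yet
escrow.attest("independent_evaluator")
second_try = escrow.release()      # the full deposit is paid
```

The design choice worth noticing is that the payment rule is fixed before the work is done, so the question of who did what, and when, is answered by the record of attestations rather than by a funder's discretion.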

Also, perhaps a solution related to proximity can be worked towards (again, perhaps using SBTs or POAPs), which works on what might be called the most proximate community principle. This would ensure that those closest to the issue and most affected by it, decide what’s important to measure and pay for, and this remains on-chain.

Lastly, transparent terms and payments using smart contracts will allow questions of who did what, when, and to what end, to be more easily answered.

Avenues for further discovery or widened application at the macro level:

  • Vesting
  • Retroactive funding
  • Oracles
  • Most proximate
  • SBTs/POAPs

PROBLEM 3 - MICRO: ATTRIBUTION

Once something that we wanted happens, we then have the problem of knowing what caused it. This is the problem of attribution. Did the thing that we did result in the good stuff we wanted?

When you begin with an intention, and a positive change happens shortly afterwards, it can be tempting to think you made it happen. ‘It’s too much of a coincidence for this not to be down to what we did, right?’.

Wrong, often, unfortunately. Causality is extremely difficult to prove, even in seemingly the most nailed-on examples. This is why, in medicine, researchers are cautious about attributing effects to causes. Instead, more commonly, ‘associations’ are made (‘where X happens we often see Y also happening’). This is sketched out in Bad Science by Ben Goldacre, who uses the example of the relationship between smoking tobacco and lung cancer to illustrate that, despite the likelihood of cigarettes causing cancer being accepted since the 1940s, the causal relationship between the two was not definitively established until the mid-1960s (in part due to PR efforts by tobacco companies).

And the problem of attribution becomes even more pronounced in interventions designed for behaviour change. Socially, the problem of attribution is reducible to bias. The University of British Columbia breaks this bias down into two forms in the context of organisational change and management:

“The first is called the fundamental attribution error. This error is a tendency to underestimate the effects of external or situational causes of behavior and to overestimate the effects of internal or personal causes. Hence, when a major problem occurs within a certain department, we tend to blame people rather than events or situations.

The second error in attribution processes is generally called the self-serving bias. There is a tendency, not surprisingly, for individuals to attribute success on an event or project to their own actions while attributing failure to others. Hence, we often hear sales representatives saying, “I made the sale,” but “They stole the sale from me” rather than “I lost it.” These two biases in interpreting how we see the events around us help us understand why employees looking at the same event often see substantially different things.”

Any social intervention can be said to be vulnerable to these biases, and aid is no exception. To return to our earlier example, access to water resources, even in dry areas, can help prove the point. Wells have been successfully bored in desertified, water-scarce areas, and new water sources tapped. At this stage, you’d be forgiven for thinking: job done. But in some cases, due to a variety of subjective forces (including a lack of local ownership and little consultation with communities), wells go unused within a few months. The issue is that, in design, the project is biased towards completion of the well (accessing the water). What is underestimated, yet critically important, is its sustainable use long term.

The attribution issue doesn't stop at bias. Any given intervention is surrounded by other constantly changing factors, like culture, economics, environment, and politics. These fluid elements can create conditions whereby the impacts of programmes designed to ‘support’ development become imperceptible. Contemporary degrees of connectedness and macroeconomic interdependence obscure the effect of local interventions. Prices, for example, can suddenly change due to macro conditions and, as they do, define the success or failure of a programme (take micro loans for entrepreneurship). This means that, at best, any measurable impacts are camouflaged and, at worst, these glocal forces outweigh any benefits or costs.

To navigate these difficulties, some projects have relied upon subjective storytelling which, though rich and important for demonstrating change, can also be problematic. Human experience can at times feel like it’s being reduced to content-friendly marketing material, and projects risk creating the impression that stories are captured to satisfy funders.

For Impact DAOs, the challenge to overcome is to ensure we are incentivising outcomes that weren’t on their way anyway. If a community plants mangroves, and these sequester more carbon, has the planting happened because of an incentive from a DAO, or is it because of a resurgence in particular forms of community knowledge, or a governmental awareness campaign, or all?

One potential answer is it doesn’t matter. If we are funding outcomes that are going to happen anyway, at worst we may only be risking the likelihood of those outcomes happening elsewhere; even perhaps sustaining the positive outcome for longer. In the frame of this response, the outcome is the most important thing, not the intention, nor even the object of the funding. But if we are to evaluate what we did, and if it worked, we won’t know with any certainty unless we can attribute effect to cause. And this may problematise the accuracy of funding in the future, and risk higher opportunity costs.

Here are some suggestions, based on this micro-problem of attribution, for how Web3 can approach these issues in the future.

Web3 vs Impact Measurement at the Micro Level:

Impact DAOs can work on the problem of attribution by involving local stakeholders in project governance at the outset. It’s a longstanding critique of the Aid industry, that they were too slow to let local stakeholders lead over the years. Creating mechanics whereby communities most affected by challenges can set project terms would be wise. Then, verifying the plurality of that group (perhaps via POAP or SBTs) and empowering them through governance mechanics to shape the definition of the public good the project is designed to feed, would be welcome for many Impact DAOs, especially when weighed against the history of the Aid industry.

Avenues for further discovery or widened application at the micro level:

  • Stakeholder approach
  • Plurality
  • SBTs/POAPs

CONCLUSION

Our intention here was to condense some knowledge from the older Impact space - what we have variously called the ‘Impact’, ‘Aid’, ‘philanthropy’, or ‘doing good’ industries - in the form of problem statements for impact measurement over recent years. It’s by no means exhaustive or complete; it is merely an aperture onto three different problem areas at three different levels, which may provoke further discussion to improve Impact DAO effectiveness in the future. We hope, though, that there is some value-add in the building of regen’s institutional memory, so that smarter Web3 folks can communicate and build with these problems in mind. This is our humble summary of some of the problems of the past, for the tech on the horizon to build in cognisance of.

A shorter version of this article appears in our newsletter, The Schelling Joint
We want to say a big Thank You to Preeti Shetty, CEO of UPSHOT and an Impact specialist, who gave input and guidance in the formation of this article.
