Reputation serves as a key form of trust and coordination signal across various communities.
Arguably, much of the success of Web2 is due to the reputation systems embedded in Web2 products (e.g., Uber driver ratings, Airbnb host ratings, eBay reviews). These ratings provided crucial trust signals on Web2 platforms and marketplaces, enabling effective coordination of service discovery and delivery.
The challenge with Web2 reputation systems is that they are controlled by the platform. As in many single-middleman arrangements, these reputation systems often serve the interests of the platform rather than those of the users. For example, millions of people are excluded from financial systems for lacking mainstream credit history. Ratings earned on one platform are not recognized by others. Paid listings are prioritized over listings with genuinely better reputations. And reputation lacks portability and composability: it is fragmented and locked into proprietary systems, where a third party can alter or revoke your credentials and strip away your control.
As we seek to move away from flawed centralized systems, it's important to create resilient, permissionless, and decentralized reputation systems.
When embarking on the development of reputation systems within a paradigm of decentralization, verifiability, and resilience, it is paramount to establish the correct design principles. These principles serve as the foundation for the system's architecture, product, and user experience, ultimately determining its ability to deliver on its promises.
Based on Karma3 Labs’ work with pioneering decentralized communities, we outline foundational principles for a reputation system that works in the open and decentralized context of Web3.
The high-level summaries of these design principles are:
Reputation is inherently social, built on peer-to-peer trust. In a community, as people interact with each other, they start to generate various trust signals for each other, such as a FOLLOW or an UPVOTE on a social platform.
Reputation is a macro property that emerges from the micro-interactions between agents in a complex system. Reputation starts to emerge and evolve based on the peer-to-peer social trust signals for each participant in a community.
Reputation systems require intentional curation as an organic evolutionary process. Designing and accumulating desirable trust signals are essential curation exercises. The data primitives, such as attestation schemas, need to be defined to capture these trust signals for reputation computation.
A reputation system needs to have an intentional bootstrapping phase. This phase models the desirable behavior in a community and sets up the initial social incentives in the community. Intentional curation by early participants is essential for building the right foundations for a decentralized reputation system.
The system's resilience should be rooted in its decentralized social nature, withstanding attacks without gatekeepers. As the number of users and reputation signals grows exponentially, a positive feedback loop makes the system increasingly capable of withstanding attacks, ensuring resilience, reconfigurability, and permissionless innovation.
Reputation usually emerges from peer-to-peer (p2p) interactions. The origin of reputation is communal - a community of people interacting with each other, (un)intentionally developing trust signals. Once these social interactions grow in a community, each person develops a degree of trust for another, based on their own social graph. Eventually, this social, peer-to-peer reputation system becomes a way to coordinate and transact with each other.
For instance, even before centralized financial institutions existed, loosely organized credit communities operating in villages embodied the most raw and authentic version of communal trust. People in these communities would lend money to each other based on some form of social capital or reputation. Fast forward to today, and we still largely depend on our interpersonal and social networks to navigate both professional and personal choices. For example, we are more likely to hire a candidate referred by a close friend or coworker than to rely solely on a talent marketplace; we'll likely pick a restaurant vouched for by a friend (who has good taste in food) over a bare Google search result.
While this peer-to-peer approach works well, it is limited to our first- and second-degree connections within a local community or context. Once the community grows, it becomes difficult for everyone to interact or connect with each other, and it is impractical to form a communal consensus on everyone's reputation through one-to-one interactions. Instead, some individuals or entities start to accumulate delegated trust from the community, which comes to rely on these trusted entities to assess and manage its reputation system. However, such a top-down approach tends to become opaque over time, often falling prey to misaligned incentives, corruption, and collusion.
Could a reputation system work without the constant oversight of a centralized authority? Can we go back to first principles, where reputation emerges from community interactions? With the recent innovations in Web3 infrastructure, open algorithms, and verifiable compute systems, we can now build a resilient decentralized reputation system from bottom-up peer-to-peer signals.
💡 We no longer need these agencies of trust. Peer-to-peer signals are usable again thanks to new technology, the digitization of social interactions, and a huge increase in computing power. Web3 also makes data assessable outside of closed silos. Reputation emerges as a property of the interactions of agents in a complex social system, and now we can not only observe it, we can compute and quantify it.
There are several benefits to this. The open data layer enabled by Web3 infrastructure eliminates our dependence on centralized gatekeepers who manage and monetize user social graphs. Reputation systems built using on-chain data can bring increased transparency, sovereignty, and censorship resistance, promoting a more democratic and trustworthy reputation evaluation system.
As an example, decentralized social platforms such as Lens and Farcaster, touted as protocol layers for decentralized social apps, have open social graph data. With these openly verifiable data sources, one can build a trustworthiness metric from the peer-to-peer interactions within these social communities.
In any community, whether it's social networks, DAOs, or subreddits, users are constantly interacting or transacting with each other. These peer-to-peer signals can be captured as peer-to-peer attestations. For example, when Alice 'follows' Bob on a social network or community, it is an attestation from Alice to Bob, potentially signaling that Alice is interested in Bob's content or activity. A collection of these attestations among users creates a social graph for the community. This social graph can be utilized for creating personalized reputation signals for each user, instead of relying on a centralized credential authority.
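To make this concrete, here is a minimal sketch (in Python, using hypothetical data) of how raw follow attestations might be aggregated into the weighted social graph described above; the names and counting scheme are illustrative, not a prescribed format:

```python
from collections import defaultdict

# Hypothetical FOLLOW attestations: (attester, subject) pairs.
attestations = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("carol", "bob"),
]

# Aggregate the attestations into a weighted social graph:
# local_trust[a][b] counts how many times a signaled trust in b.
local_trust = defaultdict(lambda: defaultdict(int))
for attester, subject in attestations:
    local_trust[attester][subject] += 1
```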
💡 The key intuition when calculating the trustworthiness of a person in a community is — a person's reputation is recursively calculated by how much others trust this person, weighted by those people's reputation.
There is a class of PageRank-like algorithms, such as EigenTrust, published in 2003, that calculate the trustworthiness of users (or nodes) in a decentralized (peer-to-peer) system. Recursive matrix multiplication over an array of interpersonal trust scores converges to the principal eigenvector, which provides a global reputation score for every user in the system.
💡 This unique trust assessment is the final puzzle piece: It is how peer-to-peer, subjective trust is transformed into an objective reputation.
As an example, you trust your friends, but because each person only has so many friends, that alone is too limited to make a reliable system for millions of users/peers in a network. You can expand it by asking your friends who they trust and weighting their opinions by how much you trust each friend. Applied at scale using the linear algebra behind EigenTrust, you initialize a trust vector (your network) with a set of seed peers (your friends) that you trust.
You then keep multiplying that vector by a matrix that represents the pairwise trust judgments of all the peers in the network. This is the power method, and it converges to the principal eigenvector of the matrix. Eventually, you get complete coverage of everyone connected to you, directly or indirectly, in a single eigenvector calculation.
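The sketch below illustrates this power iteration. It is a simplified, illustrative implementation rather than the full EigenTrust protocol: `local_trust[i][j]` holds peer i's raw trust in peer j, `pretrust` is the seed-peer distribution discussed later, and `alpha` controls how strongly each step is pulled back toward the seeds.

```python
import numpy as np

def eigentrust(local_trust, pretrust, alpha=0.2, tol=1e-9, max_iter=1000):
    """Illustrative EigenTrust power iteration (a sketch, not the full protocol).

    local_trust: (n, n) array; local_trust[i, j] = peer i's raw trust in peer j.
    pretrust:    (n,) distribution over the seed peers (should sum to 1).
    alpha:       how strongly each step is pulled back toward the seed peers.
    """
    C = np.array(local_trust, dtype=float)
    p = np.asarray(pretrust, dtype=float)

    # Normalize each row so every peer's outgoing trust sums to 1;
    # peers with no outgoing trust default to the pretrust distribution.
    for i in range(len(C)):
        s = C[i].sum()
        C[i] = C[i] / s if s > 0 else p

    # Power iteration: t converges to the principal eigenvector.
    t = p.copy()
    for _ in range(max_iter):
        t_next = (1 - alpha) * C.T @ t + alpha * p
        if np.abs(t_next - t).sum() < tol:
            return t_next
        t = t_next
    return t
```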
Reputation emerges as a result of the user interactions taking place within a socio-economic complex adaptive system such as a community. For instance, when people upvote a post they like on a subreddit, the count of upvotes and downvotes plays a role in the rank of the post and the reputation of the author. Here, the upvote/downvote is a data schema that captures user intent, or attestation, toward a post and its author. Reddit spent considerable time curating the values and utility of the platform for its members, and user actions like upvoting, commenting, and giving karma points were there from the beginning for the community to coordinate. Over time, these high-value user actions became essential for the ranking and reputation of people and posts on Reddit.
💡 With the right rules for peer-to-peer interactions, reputation manifests as a property of a complex system, indicating the trustworthiness level of each agent. Since peer-to-peer interactions are inherently social, a reputation system emerges by itself, enabling effective coordination and collaboration.
This emergent reputation system gradually matures into a reliable trust management system. The more that community members interact with each other and signal trust, the more efficacy the reputation system demonstrates.
💡 Reputation is a macro property that emerges from the micro-interactions among participants in the community.
Ironically, communities often want to benefit from reputation systems from day one without laying the foundation of capturing user interactions at the micro level. The capturing starts with designing the appropriate data schemas and user interactions most useful for the community to coordinate and find value. This could be a 'like' button, an 'invite code', or a 'follow/subscribe’ feature. Or it could be a common on-chain transaction or asset among community members. Eventually, the reputation signals come from these data schemas and user actions.
It's also important to note that the quality of these user actions and data schemas is critical, especially in the Web3 context. For example, for an NFT marketplace or DEX, if the volume of transactions includes wash trading, it is faulty to use transaction volume as a signal of the reputation or value of the assets or marketplace. But if the schemas capture which users bought which NFTs, they may enable more valuable ranking and reputation signals for users, NFTs, and assets.
💡 Peer-to-peer attestation and credentialing tools are available for developers to use for building schemas. However, the choice of data schemas, data structures, validation, and the sequence of data collection via user interfaces is equally essential, if not more so, to get the desired output from these attestation schemas.
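As an illustration, a minimal attestation schema might look like the following sketch; the fields are hypothetical, and in practice they would be defined by whatever attestation tooling the community uses:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attestation:
    """A hypothetical minimal peer-to-peer attestation schema."""
    attester: str          # who makes the claim, e.g. a wallet address
    subject: str           # who or what the claim is about
    schema: str            # the curated signal type, e.g. "FOLLOW" or "UPVOTE"
    weight: int = 1        # strength of the signal, if the schema allows it
    timestamp: int = 0     # when the signal was issued, e.g. a block number
    signature: bytes = b"" # binds the claim to the attester, making it verifiable
```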
User attestations should be verifiable to make sure that the data input and fidelity are not corrupted. Community governance helps in keeping checks and balances around data schemas, data authenticity, and reputation computation. Additionally, a reputation system can be easily forked in case the community doesn't achieve consensus on any of these parameters. This can lead to multiple types of reputation systems for the same community. Privacy-preserving designs that enable anonymity or confidentiality for attestations and reputation computing can also be used in designing the attestation and compute layers.
The appropriate reputation signals shouldn't be hard-coded. It takes time for communities to understand which trust signals should be captured and, from a technical standpoint, what kind of data schema should be used. Take upvoting a comment on Reddit: it is not always clear whether it captures agreement with the post or approval of its quality and contribution. Communities in different verticals also need different designs; when it comes to reputation signals, a social media platform is obviously different from a service-rating platform or a community governance forum. Communities should be able to use different data schemas to signal different kinds of sentiments, and designing and discovering them should be an iterative process.
Curation means intentionally designing for certain behaviors to emerge organically within a community. This helps cultivate a shared understanding of goals and culture. As trust signals grow qualitatively and quantitatively, the reputation system becomes stronger and more relevant.
Evolution implies that the reputation system should be built over time, taking an ecological approach instead of a pure engineering approach. Each phase can be distinct in its mandate and lays the foundation for the next one, embodying different complexities in reputation signals, algorithms, incentive design, and level of openness and decentralization.
When it comes to surfacing the right complexity level of trust signals, it is also important to keep iterating. Simple, singular designs are easily understood by participants and gain faster, broader adoption; however, they risk directing players to optimize for and game the metric. On the other hand, complex designs can easily get people lost in the weeds. A fine-tuned balance results from iteration based on community responses rather than from a perfect design fixed at the beginning.
In addition to curating reputation data schemas to capture user intent, the context of reputation must also be captured and curated. When we seek the 'reputation' of a person, it is incomplete without context. For instance, a person's reputation as an Uber driver will differ from their reputation as a programmer. For reputation systems to work, they have to be context-specific.
The context informs which qualities or trust signals lead to a higher reputation score. For example, a high rating from a well-regarded programmer A should contribute more to programmer B's reputation score than a rating from an unknown. In EigenTrust, this peer-to-peer trust, molded in a specific context, is termed local trust ("local" because the trust is always anchored at a specific peer).
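In the original EigenTrust paper, local trust is derived by normalizing raw interaction scores: if $s_{ij}$ is the net satisfaction peer $i$ has in peer $j$, the normalized local trust is

$$c_{ij} = \frac{\max(s_{ij},\,0)}{\sum_{k} \max(s_{ik},\,0)}$$

so that each peer's outgoing trust sums to 1 and is always anchored at that peer.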
However, peer-to-peer local trust alone is not enough to define the context for reputation calculation. In a community reputation system, a group of initial members often defines the purpose, values, and culture of the community, which gives context to reputable contributions and trusted behaviors, and they propagate this desired alignment to members who join later. These initial reputable and trusted members can be selected as seed peers in the EigenTrust computation. This is useful not only for factoring context into the reputation calculation but also for Sybil resistance and bootstrapping, which we will dive into later.
To bootstrap a reputation system for a community, the early community members play a crucial role. They help curate the values, culture, tone, and desired communal behavior protocol for the community. For instance, the first few power users on a social network; or the first 100 subreddits - all help define the direction of how people engage and find these platforms useful. In the beginning, a community can form based on token gating or credential filtering, but over time, early community members and their interactions will determine the direction and magnitude of the community.
💡 When EigenTrust is used as a decentralized trust system in a community, you start with a few seed peers. These seed peers can be determined by token or credential filtering, or they can be a diverse group among the early participants, but they must not be Sybils. Without this deliberate bootstrapping phase, a community or network can easily attract spam and Sybil actors, misaligning the purpose and utility for its desired users.
In EigenTrust, these seed peers can be changed at any point in time, giving the reputation system plenty of elasticity. Additionally, the influence of the seed peers, known as the alpha value, can be set anywhere between 0 and 1. These configurable bootstrapping parameters ensure that the reputation system remains dynamic and responsive.
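Continuing the earlier `eigentrust` sketch, reconfiguring the system is just a matter of re-running the computation with a different seed set or alpha; all the values here are hypothetical:

```python
import numpy as np

# Hypothetical raw local-trust matrix for a 4-peer community.
local_trust = np.array([
    [0, 3, 1, 0],
    [2, 0, 1, 1],
    [0, 1, 0, 2],
    [1, 0, 2, 0],
])

# Bootstrap with peers 0 and 1 as seeds and a strong pull toward them...
scores = eigentrust(local_trust, pretrust=np.array([0.5, 0.5, 0, 0]), alpha=0.5)

# ...then later re-run with a broader seed set and a weaker pull,
# without touching the underlying attestation data.
scores = eigentrust(local_trust, pretrust=np.array([0.25, 0.25, 0.25, 0.25]), alpha=0.1)
```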
One challenge to overcome is how to bootstrap the reputation of newcomers who join without any prior history or actions; in other words, how to avoid punishing them simply for being new. When someone joins a new social network, how do they grow their following or reputation when no one follows them yet?
One way to address this is by porting over a new user's reputation from another community, which can help bootstrap an initial reputation score in the new one. Another simple solution is to offer newcomers a grace period, during which they won't be judged or ranked and instead have the opportunity to shine and build trust among their peers. We see similar practices when applying for new credit cards, looking for housing in new neighborhoods, and even across decentralized autonomous organizations (DAOs).
Another aspect to keep in mind is how to incentivize positive contributions among new users. How a community deals with the quality vs. quantity in user participation becomes essential. The data schemas (local trust) should allow users to earn a reputation in the community in the desired way. Over time, new users who bring the most value to the community can also help guide the moderation and evolution of these reputation behaviors.
By employing some of these strategies, we can foster a more inclusive and open environment within a community. As newcomers gain trust and contribute positively to the community, their reputation will naturally grow, further strengthening the decentralized trust ecosystem.
During the initial phase of activity in a community or network, showing a reputation signal to users before it becomes valuable or well understood across the community can create unconscious biases. It can prevent natural interactions within the community or create undesirable clusters. For example, what if a social network only showed posts or feeds from the most-followed users?
Similarly, when browsing an app store, a user might be discouraged from trying a new product or app with a low rating even though only a handful of reviews are behind it. A cutoff or grace period in ranking or showcasing the reputation of applications would allow user preferences to be revealed fairly.
As a result, many review and ranking systems avoid showing ratings or scores before reaching a threshold number of signals with meaningful sample sizes. From a UI/UX standpoint, it becomes important to introduce the apt reputation signals at the right time and in the right format. The way reputation signals like rating counts, badges, and leaderboards are displayed should not bias users toward acting a certain way.
The key value of a reputation system is to aid in search and discovery within a community or network. As the bootstrap phase matures, users start relying on rankings and recommendations to find what they are looking for in a community or marketplace. Most of the recommendation algorithms in use today are aimed at making users pick something rapidly and stick to it for as long as possible.
Such motivation can result in users being trapped in clusters and local optima. Even on some popular search and social apps, people find the same choices surfacing again and again, trapping them in their own echo chambers. For a decentralized reputation system, the algorithm or compute layer weighs actual peer-to-peer trust when producing ranking and recommendation results.
Previous discussions in the space often focus on economic incentives for desired behaviors, while utility and social-value incentives are overlooked. Yet communities respond to incentives in complex ways, even in centralized systems: Yelp's product teams were surprised that the quality of restaurant reviews dropped significantly when they experimented with introducing economic incentives, because it diluted the community of review champions who were primarily driven by social values.
💡 Given the social and communal nature of reputation in a decentralized social system of p2p interactions, a sustainable reputation system needs to harness the social capital dynamics of altruism, communal responsibility, and rewarding positive contributions in order to bootstrap vigorous attestations in the early stage.
There could be four phases of incentivizing peer-to-peer trust signaling, with the first three emphasizing social values:
Altruism: individuals who understand the importance of the system and view it as a public good bootstrap the standard and desired behavior
Community: the community takes responsibility for cultivating a robust reputation system based on quality reviews as a public good
Fame: internal (community) and external (outside of the community) recognition of individuals, linking quality contributions to other liquid social assets
Fortune: economic incentives to play the long game
On top of social and economic values, the system must ensure there is utility for users in signaling trust, and that active participation in signaling trust is beneficial in the long term. Game theory suggests that repeated games are more likely than one-off games to produce collaboration among players. Taking the effort to signal trust, typically an extra step when interacting with a system, needs to be motivated both by a short-term self-interested utility, such as a vote on a decision, and by a long-term utility, such as leaving a review for an Airbnb stay: you know you will be renting places frequently, so you have an interest in the system holding trustworthy reviews.
Another property that becomes important as the reputation system matures is its ability to withstand attacks (Sybil attacks and collusion, among many others) in a permissionless way. While curating reputation signals throughout the phases, we also need to build system resilience over time. The ability to modify schemas, access reputation data, make compute decentralized or verifiable, and add transparent moderation policies ensures that the system is resilient.
In a permissionless and decentralized system, it is crucial to have robust mechanisms to withstand malicious attacks and exploits. For instance, 1) Sybil actors attempting to manipulate reputation, 2) Untruthful attestations that provide dishonest signals for personal gains, and 3) Attempts to game the system and mine reputation signals, thereby distorting the effectiveness of reputation signals.
A common approach in both Web2 and Web3 is to identify Sybils through pattern recognition based on their behaviors: organizations try to detect fraudulent activities and train machine learning algorithms to identify Sybils. However, this becomes a game of Whack-A-Mole, requiring continuous updates to the ML models as Sybil actors find ways to evade detection. The effort needed to identify and protect against Sybils grows linearly or even exponentially over time, and safeguarding the integrity of the reputation system becomes increasingly expensive.
While ChatGPT challenges this notion, humans are still capable of distinguishing Sybils from honest actors through social interactions. Sybils tend to form distinct social clusters, which become apparent when these interactions are visualized as connections on a social graph; lately this has become increasingly evident on Twitter feeds. Algorithms that calculate reputation scores from social graph data can leverage this intrinsic property of social networks to minimize the impact of Sybils. Since it is difficult for Sybils to form dense interaction networks with real humans, their reputation scores remain low even though they form their own Sybil clusters, rendering their signals less influential.
💡 Conversely, social interactions inherently provide sybil resistance.
As a reputation system matures and reaches a steady state, two things emerge:
peer-to-peer reputation signaling activities of users generate a wealth of data, forming a well-connected network graph, and
reputation scores or trust scores can be calculated for every user in the network. The EigenTrust algorithm weighs the reputation signal from user A to user B by user A's own reputation.
Sybils end up with low reputation scores since they cannot receive trust signals from highly reputable participants. Consequently, if user A is a Sybil with a low reputation score, whatever they say about user B's reputation does not significantly influence B's overall score.
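A toy example, reusing the earlier `eigentrust` sketch with hypothetical values, shows this effect:

```python
import numpy as np

# Peers 0-2 are honest and interconnected; peers 3-5 are a Sybil
# cluster that attests heavily to one another but receives no
# trust from any honest peer.
local_trust = np.array([
    [0, 2, 2, 0, 0, 0],
    [2, 0, 2, 0, 0, 0],
    [2, 2, 0, 0, 0, 0],
    [0, 0, 0, 0, 9, 9],
    [0, 0, 0, 9, 0, 9],
    [0, 0, 0, 9, 9, 0],
])

pretrust = np.array([1/3, 1/3, 1/3, 0, 0, 0])  # honest seed peers
scores = eigentrust(local_trust, pretrust, alpha=0.2)
# The Sybil cluster's scores stay at zero: no amount of internal
# attestation volume compensates for the missing inbound trust
# from reputable peers.
```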
Additionally, a personalized implementation of EigenTrust enables a personalized reputation score for each user, which significantly raises the difficulty of rigging the reputation assessment. In this scenario, each user's trust graph is centered around and unique to them.
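Reusing the earlier sketch, one way to approximate such a personalized variant, analogous to personalized PageRank, is to concentrate the pretrust entirely on the querying user:

```python
import numpy as np

def personalized_scores(local_trust, viewer, alpha=0.2):
    # Concentrate all pretrust on the viewing user, so the resulting
    # scores reflect trust as seen from their position in the graph.
    p = np.zeros(len(local_trust))
    p[viewer] = 1.0
    return eigentrust(local_trust, p, alpha=alpha)

# Each user gets their own ranking of everyone else in the network.
viewer_0_scores = personalized_scores(local_trust, viewer=0)
```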
💡 Rigging the system becomes exceptionally challenging and offers minimal rewards, reminiscent of the situation portrayed in the Black Mirror episode "Crocodile": the Sybils would have to change the reputation reality for every single user.
The sleeper-agent attack, which is difficult to detect until it happens, has a few mitigation strategies in the context of EigenTrust.
You can introduce a high economic or social cost for a sleeper-agent attack, such as a significant loss of social capital for a highly reputable user who turns malicious. This can be combined with staking the user's identity or other valuable assets, whose loss becomes a greater disincentive than any gain from attacking the system. Another solution is to employ counter-designs. Time decay, for example, may be implemented, wherein reputation is maintained only by continuously receiving trust signals.
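One possible formulation of such time decay (an assumption on our part, not part of the original EigenTrust paper) is to attenuate local trust exponentially with the age of the last signal:

$$c_{ij}(t) = c_{ij}(t_0)\, e^{-\lambda (t - t_0)}$$

where $t_0$ is the time of the most recent trust signal from $i$ to $j$ and $\lambda$ sets how quickly unrefreshed reputation fades.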
A successful peer-to-peer reputation system is evidently high-impact because reputation guides the human decision process. It has an amplifying effect. As such, when we design such a system, we must consider how to ensure its continued success. How do we ensure the computation is not controlled by a central entity or has hard-coded heuristics? How do we ensure we can trust the computation results? How do we ensure the reputation system is scalable?
💡 The system must be permissionless, decentralized, and openly verifiable.
Permissionless means anyone can participate in the system in any role, without arbitrary gate-keeping. Any developer can create their own reputation system based on on-chain or verifiable off-chain data.
Decentralized means there is no single party capable of steering the direction and fate of the reputation system, decoupled from and against the collective will of the users of the system.
Verifiable means that the correctness of the computed results can be easily verified. This removes trust assumptions; for example, the user of a score won't have to place trust in the goodwill and correct operation of whoever computed the score.
This essay is not only a highlight of our learnings around the design principles of an effective reputation system in the making, but also an invitation to anyone interested in Sybil-resistant identity and reputation systems to join us in building a robust and open reputation layer for internet and on-chain communities.
A few more design considerations can help add robustness to the reputation system.
Peer-to-peer attestations should be verifiable. This ensures that the data input and fidelity are not corrupted.
A governance system should be in place. It helps in curating and evolving the algorithms and performing updates on key parameters based on community consensus.
Forking should be possible, both to enable competition and to resolve cases where the community cannot reach consensus.
Last but not least, privacy-preserving designs can be considered for required contexts.
Our effort at Karma3 Labs is not a lone journey toward the mission of enabling on-chain reputation. We have been learning from the examples and successes of our recent partnerships with Lens Protocol in Web3 social, in a decentralized App Store-like review platform, and in NFT ranking use cases. We are excited to be part of the Web3 development frontier, where reputation and trust are the pivotal keys to unlocking the potential of equitable communities to make a difference.