Arweave: Permanent Decentralized Storage

By: Raye Hadi

Table of Contents:

  1. Introduction

  2. Background

  3. Blockweave and SPoRA

  4. Synchronization Blocks

  5. The Permaweb

  6. Token Structure

  7. Conclusion

  8. Sources

Introduction

     Arweave is a data storage protocol that aims to guarantee data permanence for its users while prioritizing decentralization of the network. I find this project interesting because of the epidemic of distrust in public information that is currently spreading throughout society. It’s remarkable that at a time when everyone has more access to information than ever before, people can still feel uninformed. Media centralization, societal polarization, and rampant bias have skewed people’s perception of what were once trusted and reliable sources. Information from reputable, impartial sources is becoming scarcer. The overwhelming centralization and monopolization of content distribution platforms has enabled the censorship and removal of unfavored information, regardless of its accuracy. As this epidemic worsens, there is an increasing need for data storage outside the reach of centralized control. Arweave hopes to solve this problem by providing a permanent, tamper-proof history of all data, past and present.

Background

     Arweave was founded by entrepreneur and computer scientist Sam Williams with four main goals:

  1. archive real-world data

  2. provide decentralized storage solutions for institutions

  3. support decentralized applications

  4. store cultural and historic information.

Sam Williams claims that by changing the way people perceive the past, you change the actions they take in the future. Williams’s aim is for Arweave to become a digital archive of incorruptible information, out of reach of anyone who tries to alter how people see the past or to manipulate or remove information. Williams intends to decentralize every aspect of Arweave and extend its reach across the entire web3 and blockchain space. This is reflected in partnerships with Kyve, Solana, Polkadot, Cosmos, and Avalanche, as well as an integration with Polygon that makes Arweave Ethereum compatible. All these partners are heavyweights in either the L1 or L0 categories, which aligns with Arweave’s objectives of adoption and interoperability. Arweave has also secured the backing of top crypto VC funds such as a16z and Coinbase Ventures. So, while many would still classify Arweave as a project in development, tailwinds such as these lend confidence in Arweave’s ability to fulfill its ambitions.

Blockweave and SPoRA

     The major technological breakthrough for Arweave is its reimagining of the blockchain, which it calls the blockweave. Because the system aims to permanently store the entire history of information, it needed a way to verify blocks without requiring nodes to process the entire chain every time a new block is appended. The blockweave is a blockchain derivative, but as the name suggests, instead of organizing blocks in a strictly sequential chain, blocks are organized in what can best be described as a “weave”. This data structure stems from Arweave’s novel consensus mechanism, known as Succinct Proofs of Random Access (SPoRA). SPoRA is a variation of Proof-of-Access (PoA) combined with Proof-of-Work (PoW) that dramatically reduces the amount of energy required to create new blocks. With SPoRA, miners can prove to the network that they are storing data without having to verify every block of data on the chain. Miners are not required to store the entire chain, but they are encouraged to store as much of it as possible. This is because generating the next block requires data from a randomly chosen earlier block, referred to as the “recall block”, in addition to the link to the previous block. This extra link to an older block is where the “weave” comes from. The more blocks a miner stores, the more likely they are to possess the recall block and the data inside it that is required to generate the new block. It is economically inefficient, and for most miners infeasible, to store the entire chain. Across the network, however, all data is stored collectively: each miner is incentivized to store the data that other miners aren’t, in order to gain a competitive advantage if that block happens to be a future recall block.
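To make the storage incentive concrete, here is a minimal toy sketch in Python. It is not Arweave's actual SPoRA implementation: the function names, the use of SHA-256, and the rule of taking the previous block's hash modulo the weave height to pick the recall block are all assumptions made for illustration. It only shows the idea described above, that a miner who stores a larger share of the weave is eligible to mine more often.

```python
import hashlib
import random


def recall_block_index(prev_block_hash: bytes, weave_height: int) -> int:
    """Pick a pseudo-random recall block index (hypothetical selection rule)."""
    digest = hashlib.sha256(prev_block_hash).digest()
    return int.from_bytes(digest, "big") % weave_height


def can_mine(stored_blocks: set, prev_block_hash: bytes, weave_height: int) -> bool:
    """A miner can attempt the next block only if it stores the recall block."""
    return recall_block_index(prev_block_hash, weave_height) in stored_blocks


def estimate_eligibility(stored_fraction: float, weave_height: int = 1000,
                         rounds: int = 5000, seed: int = 0) -> float:
    """Estimate how often a miner storing `stored_fraction` of the weave
    happens to hold the recall block, over many simulated rounds."""
    rng = random.Random(seed)
    stored = set(rng.sample(range(weave_height), int(stored_fraction * weave_height)))
    hits = 0
    for i in range(rounds):
        # Stand-in for the hash of the previous block in each round.
        prev_hash = hashlib.sha256(f"block-{i}".encode()).digest()
        if can_mine(stored, prev_hash, weave_height):
            hits += 1
    return hits / rounds
```

Calling `estimate_eligibility(0.25)` and `estimate_eligibility(0.75)` should show the second miner being eligible roughly three times as often, which is the competitive advantage that encourages miners to store more of the weave, and in particular the parts that other miners lack.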

Synchronization Blocks

     After a miner proves they possess the legitimate data of the recall block, they race, as on classic PoW chains, to be the first to compute the correct hash and broadcast it to the network (keep in mind that the miner must be storing the recall block in order to compute the correct hash). This raises the question: how do the miners not storing the recall block come to consensus on the legitimacy of the hash of the new block being proposed? The answer is synchronization blocks. Synchronization blocks allow every miner to independently verify whether a new block is valid when another miner claims to have found the appropriate hash, even if the verifying miner isn’t storing the recall block. They are special blocks generated once every twelve blocks (about every hour). Each one stores an updated list containing the balance of every wallet in the system as well as a full list of every block’s hash. These blocks are propagated through the network, and their contents are compiled by consulting every miner, accessing their data, and appending it to the list. This makes them accurate records of the entire chain, which is what allows miners to use them to audit blocks that rely on data they aren’t storing. It must be noted that while synchronization blocks give miners the tools to check the legitimacy of blocks built with data they aren’t storing, they do not give miners who lack the recall block the information required to compute the hash of the new block.
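The auditing role of a synchronization block can be sketched as a simple lookup. The following Python is a hedged illustration only: the `SyncBlock` class, its field names, and the `verify_recall_claim` helper are hypothetical and do not mirror Arweave's real data structures or validation rules. It shows how a miner holding only the list of block hashes can check a claim about a recall block without holding that block's data.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SyncBlock:
    """Toy synchronization block (hypothetical layout)."""
    wallet_balances: dict = field(default_factory=dict)  # address -> balance
    block_hashes: list = field(default_factory=list)     # index -> block hash (hex)


def block_hash(data: bytes) -> str:
    """Hash a block's data; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).hexdigest()


def verify_recall_claim(sync: SyncBlock, recall_index: int,
                        claimed_recall_hash: str) -> bool:
    """Check a claimed recall block hash against the sync block's hash list.

    The verifier needs only the hash list, not the recall block's contents.
    """
    if not 0 <= recall_index < len(sync.block_hashes):
        return False
    return sync.block_hashes[recall_index] == claimed_recall_hash
```

Note that the check runs in one direction only. Knowing the hash of a recall block lets a miner audit someone else's claim, but it does not reveal the block's data, so it cannot be used to mine on top of a block the miner does not store.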

The Permaweb

     Another important piece of Arweave’s technology is a web page and application layer called the “Permaweb”. The Permaweb is critical to Arweave’s ambitions because it gives users a traditional web experience when accessing data stored on the network. It is a layer on top of the underlying blockweave where all decentralized applications run. To attract more activity into the Arweave ecosystem, users are given a familiar UX, and developers are given low-cost, low-difficulty mechanisms for deploying applications and pages. To further incentivize developers to build on the Permaweb, Arweave offers a developer toolkit compatible with familiar web languages (HTML, JavaScript, CSS) to promote the growth of their applications. This is accompanied by a percentage of the AR token supply set aside specifically to encourage development. All of this serves Arweave’s ambition to create an uncensorable ecosystem of interactive applications backed by the security of permanence. By relying on the security and integrity of the blockweave, the Permaweb allows Arweave to host a web-based ecosystem out of the reach of monolithic power structures.

Token Structure

     The native token deployed by Arweave is called the AR token, and it is used to coordinate activity and incentivize data storage within the network. The token follows an inflationary model and is used to pay for perpetual, theoretically permanent storage on the network. This is done through Arweave’s storage endowment, a pool that takes 86% of all transaction fees and distributes them to miners over time to ensure sustainable long-term storage. When the network launched in 2018, 55 million AR were created in the genesis block. Arweave will have a maximum supply of ~66 million AR, as the remaining 11 million AR will gradually be brought into circulation as mining rewards, eventually being held in either wallets or endowment pools. The endowment pools also act as a deflationary force on AR, since most of the tokens paid in fees are locked away to fund long-term storage. However, this can make the price of data storage on the network considerably high, as Arweave uses a very conservative pricing mechanism. When data is uploaded to the blockweave, this mechanism projects the cost of storing it for the next 200 years. The projection is made using highly conservative assumptions and is adaptable, automatically adjusting the payment levels required to sustain the endowment pools and long-term storage. Endowment pools, transaction fees, and Permaweb interaction are the intended determinants of AR demand. Surprisingly, though, Solana holds the greatest influence on demand for AR. This is because Arweave recently entered an agreement with Solana to provide permanent storage of its transaction history. The unintended consequence is that AR’s price is now highly dependent on Solana, as an increase in activity on Solana increases the demand for AR tokens. This is less a criticism of Arweave itself and more a reflection of the growth it still requires. It is critical to Arweave’s model that the AR token appreciates free of influence from any single entity, as AR is vital to the long-term sustainability of the network.
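The arithmetic behind these figures can be sketched in a few lines of Python. The supply numbers and the 86% endowment share come from the text above; the function names are hypothetical, and the pricing function is only an assumed model in which the yearly cost of storage falls by a fixed rate. Arweave's real pricing mechanism is more involved, and the decline rate is a placeholder parameter, not a figure taken from the protocol.

```python
GENESIS_SUPPLY_AR = 55_000_000   # created in the genesis block
MINING_REWARDS_AR = 11_000_000   # released gradually as mining rewards
ENDOWMENT_SHARE = 0.86           # share of each transaction fee sent to the endowment


def max_supply() -> int:
    """Maximum AR supply implied by the figures above."""
    return GENESIS_SUPPLY_AR + MINING_REWARDS_AR


def split_fee(fee_ar: float) -> tuple:
    """Split a transaction fee into (endowment share, immediate miner share)."""
    to_endowment = fee_ar * ENDOWMENT_SHARE
    return to_endowment, fee_ar - to_endowment


def projected_storage_cost(cost_first_year: float, annual_decline: float,
                           years: int = 200) -> float:
    """Total cost of storing data for `years` years, assuming (hypothetically)
    that the yearly cost shrinks by `annual_decline` each year."""
    return sum(cost_first_year * (1 - annual_decline) ** y for y in range(years))
```

Under this model, a lower assumed `annual_decline` is the more conservative choice: it yields a higher upfront price, because the endowment must cover storage costs that are expected to fall more slowly.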

Conclusion

     While there is exciting innovation and potential surrounding Arweave, it is important to keep in mind that it is the data that is permanent, not the network. Like any distributed ledger, the network exists only as long as people are incentivized to run it. The emergence of a superior option could therefore pose an existential threat, as nodes will always flock to wherever they can accrue the most value. Until that day comes, Arweave is the best bet when it comes to decentralized, permanent storage. The protocol boasts an efficient storage mechanism, a developer-friendly application layer, and network functionality that is closely tied to its token. The combination of the three gives Arweave an impressive design, well suited to the service it aims to provide. However, one could argue that Arweave is still quite limited in its ability to provide this service effectively. Using the Arweave explorer (Arweave’s native data and analytics terminal), it is evident that two vital aspects of Arweave’s model, decentralization and adoption, are both inadequate. The explorer shows a total of only 56 active nodes, nowhere near enough to create the level of security and permanence the protocol advertises. Furthermore, the daily active address count averages between 800 and 1,000. The active contract metrics are not much better, averaging 100 to 200 active contracts daily. These statistics reflect how early-stage Arweave currently is and provide some perspective on how much adoption is still required. While this is a cause for concern in the short term, Arweave’s vision has always been long term, so this isn’t a project I would be quick to write off. With the guidance of a passionate founder, a principled identity, strong VC partners, and the use of novel technology, Arweave has the foundation to succeed and establish itself as the world’s decentralized archive of digital permanence.

Sources
