Primer on Decentralized Storage
0x85d3
June 4th, 2022

Written by your friends at NeptuneCircle (Stephen, Mike, and Kris)

Subscribe to @neptune_circle for more updates. We will post on Mirror and Twitter.

This article is largely based on the research report published by The Block Research, which we highly recommend reading (link at the bottom). We have added some thoughts and details of our own.

Table of Contents

  1. We’ll first talk about our thoughts on the market;
  2. We will then dive into the two types of decentralized storage: (1) contract-based and (2) incentive-based;
  3. There is actually a new structure: Stratos; and
  4. We will close off with some additional case studies.

Market Thoughts

  1. So far, NFTs and web3 storage are the prominent use cases, but emerging applications in areas ranging from audio/video to gaming to computation are strengthening the case for global decentralized storage networks to capture a larger share of the cloud storage market.
  2. “Wash Storage” is possible. The original Filecoin incentive structure gives storage providers an incentive to abuse the system by self-dealing (striking storage deals with themselves) to increase their chances of winning mining rewards. To improve the quality of committed capacity, Filecoin launched the Filecoin Plus program, which incentivizes storage providers to participate in verified deals with real clients by increasing their block reward share for these deals.
  3. The question remains as to why the current Web 2 sector would need decentralized storage. Even though the ethos and spirit of Web 3 call for a decentralized storage platform, there seems to be little fundamental demand for one at this moment. However, as more Web 3 projects are developed, that demand could increase dramatically.
  4. The model of decentralized storage can work if the total storage supplied in aggregate is greater than or equal to the storage demanded. The market viability depends on whether the cost of decentralized storage is less than or equal to comparable centralized services. Unlike centralized storage clouds, building a decentralized storage network only requires marginal investment costs determined by an open market of participants rather than a single entity incurring the costs for all dedicated storage hardware.
  5. Data redundancy is inherently required for decentralized storage. To achieve data availability and integrity across potentially unreliable nodes hosted by strangers over a wide area network, the source node adds some level of redundancy to each data chunk. The system can then recreate the entire block even if some nodes go offline (e.g., due to loss of network connectivity, hardware failure, or the computer being switched off).
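The redundancy idea in point 5 can be illustrated with a minimal sketch. It uses a single XOR parity chunk, which is the simplest possible scheme and is only an assumption for illustration; real networks use stronger erasure codes that tolerate many lost pieces. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size chunks (zero-padded) plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(pieces: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing piece (marked None) by XOR-ing the others."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity chunk can only repair one lost piece")
    repaired = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired


data = b"decentralized storage primer"
pieces = split_with_parity(data, k=4)   # 4 data chunks + 1 parity chunk
pieces[2] = None                        # simulate one node going offline
restored = recover(pieces)
rebuilt = b"".join(restored[:4])[:len(data)]
print(rebuilt == data)
```

The cost of this protection is storage overhead: here five chunks are stored for four chunks of data, and the file survives the loss of any one of them.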

Current State of the Market

Large Potential Market Size for Cloud-based Storage. Not only is the potential total addressable market for cloud-based storage huge, but the security properties of decentralized storage protocols also enhance the trustworthiness of the network. Blockchain-enabled decentralized storage has gained interest and adoption at an explosive rate since early 2021, coinciding with the rise of the mainstream NFT market and surging interest in web3.

Source: IoT Analytics
Source: The Block Research
The current state of the market.

Overview: Decentralized Storage

One of the main obstacles facing progressive decentralization in web3 is how to store large amounts of data in a decentralized way, given the prohibitive costs of storing data directly on popular blockchains like Bitcoin and Ethereum, which are not designed for storing large files (e.g., NFT metadata and hypermedia).

For example, it would cost about $20M to store 1 GB of data on the Ethereum blockchain. [The Block Research]
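A back-of-the-envelope calculation shows how a figure of this magnitude can arise. The sketch below assumes roughly 20,000 gas to write a new 32-byte storage slot, and the gas price and ETH price are illustrative assumptions of ours, not the inputs used by The Block Research.

```python
# Rough cost of writing 1 GB into Ethereum contract storage.
GAS_PER_32_BYTE_SLOT = 20_000   # approximate cost of writing a new storage slot
BYTES_PER_GB = 2**30
GAS_PRICE_GWEI = 10             # assumption for illustration
ETH_PRICE_USD = 3_000           # assumption for illustration

slots = BYTES_PER_GB // 32                      # number of 32-byte slots in 1 GB
total_gas = slots * GAS_PER_32_BYTE_SLOT
cost_eth = total_gas * GAS_PRICE_GWEI / 1e9     # 1 gwei = 1e-9 ETH
cost_usd = cost_eth * ETH_PRICE_USD

print(f"{slots:,} slots, {total_gas:,} gas")
print(f"about {cost_eth:,.0f} ETH, or about ${cost_usd / 1e6:.1f}M")
```

Under these assumptions the result lands in the tens of millions of dollars, and it scales linearly with both the gas price and the ETH price.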

In contrast, storage chains like Filecoin separate the consensus aspect of their blockchain from the storage of files. Cryptographic security and economic structures are used to align incentives and write the storage state to a shared consensus that can be verified by users. Filecoin, Sia, and Storj are contract-based decentralized storage protocols in that storage buyers and sellers negotiate temporary storage deals in open markets. Arweave, on the other hand, is an incentive-based and permanent decentralized storage protocol. Arweave attempts to address the limitations of storing data on blockchains by utilizing a “blockweave,” a blockchain derivative that is designed specifically for handling large amounts of data storage. [The Block Research]

Decentralized Storage #1 - Contract-Based (like Filecoin)

  • Storage buyers (demand) pay storage providers (supply) for storing and retrieving files off-chain
  • To maximize availability and fault tolerance, these files are usually split into chunks stored across multiple storage providers for redundancy or they utilize error-correction technology. Some protocols encrypt the chunks themselves (content encryption) while others encrypt only when sending data (transport encryption).
  • Since storage providers are not trusted entities, they are required to collateralize (e.g., via staking) or be vetted in some way for the opportunity to work in the network.
  • A system of rewards (e.g., mining rewards, gaining network power) or punishments (e.g., withholding rewards, slashing, losing network power) is used to try to ensure service guarantees.
  • Step 1 - Storage: Client and miner come to an agreement about the terms of the deal, including fees, storage duration, starting time, and so forth. The deal and its terms are published on chain, making the miner serving the client publicly accountable for servicing the deal. Miners then pack client data into a sector of Filecoin’s storage mining subsystem, prove once on chain that they have stored a full and unique copy of the data (a computation-heavy process known as “sealing”), and then later prove continuously on chain that they are upholding the storage deal throughout its lifetime.
    • How to Prove that the Provider is Doing Its Job?
      • Storage providers must complete two proofs to prove they’ve met storage requirements: Proof of Replication (PoRep) and Proof of Spacetime (PoSt). PoRep is used in sealing the data, where a storage provider proves that they are storing a unique and full copy of the data via a series of zk-SNARK setups which together prove that the process was done correctly. PoRep takes place just once during the initial deal between the client and storage provider when the data is first stored by the miner. [Basically a zk proof over the entire piece of data]
      • On the other hand, PoSt is used daily to prove that the storage provider is continuing to store the original data over time without manipulation or corruption. In PoSt, zk-SNARKs are used to verify that random pieces of data from the full copy are still stored and available for retrieval throughout the lifetime of the storage contract. If the storage provider fails PoSt, they will lose some or all of their collateral. [Basically a zk proof over a random piece of data]
  • Step 2 - Retrieval: In Filecoin retrieval deals, clients pay miners to fetch their data via off-chain payment channels. Payment channels are the same technology that powers Bitcoin’s Lightning Network, which keeps costs low.
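The idea behind the recurring proof, checking a random piece of the data against a commitment made up front, can be sketched with a plain Merkle tree. This is only an analogy under our own assumptions: Filecoin's actual PoRep and PoSt rely on sealing and zk-SNARKs, which this sketch does not implement, and all function names here are hypothetical.

```python
import hashlib
import random


def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


def build_tree(chunks):
    """Return all Merkle tree levels, leaves first. Chunk count must be a power of two."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes from the leaf at `index` up to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling sits next to the current node
        index //= 2
    return path


def verify(root, chunk, index, path):
    """Recompute the root from one chunk and its sibling path."""
    node = h(chunk)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Deal time: the provider stores the chunks and the root is committed publicly.
chunks = [f"chunk-{i}".encode() for i in range(8)]
levels = build_tree(chunks)
root = levels[-1][0]

# Later: a verifier challenges a random chunk; the provider must answer with it.
challenge = random.randrange(len(chunks))
proof = prove(levels, challenge)
ok = verify(root, chunks[challenge], challenge, proof)
tampered = verify(root, b"corrupted", challenge, proof)
print(ok, tampered)
```

Because the challenged index is unpredictable, a provider who has discarded or corrupted part of the data will eventually fail a challenge, which is the moment collateral would be at risk.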
The two places where FIL coin is used.

Decentralized Storage #2 - Incentive-Based (like Arweave)

Comparing Arweave and Filecoin.
A key example of incentive-based storage is Arweave.

Permanent Storage. Arweave combines an incentive-based endowment pool with a blockchain-derived technology to create a global permanent hard drive. Most of the demand for Arweave comes from Solana (SOL), and we see similar price action between the two tokens.

SOL's price action.
Arweave's price action.

SOL is tied to Arweave via KYVE:

Structural design among Arweave, KYVE, and SOL.
Blockchains/platforms already collaborating with KYVE.

Could ETH use Arweave? Maybe, in connection with the Purge (the stage of Ethereum's roadmap that deals with expiring old history):

Mechanism Design of Arweave

The hypothesis put forth is that the blockweave is better designed for data storage than typical blockchains due to how new blocks are inextricably linked to not only the last block but a random historical block. In combination, all the data required to process new transactions and new blocks is memoized into the state of each block. Therefore, miners do not need to store the entire blockweave. Instead, miners are incentivized to store as much of it as possible to maximize their chances of earning mining rewards if they can prove access to the last block and recall block. Effectively, this makes Arweave a state-sharded storage system where no single miner needs to store the entire state, yet the global state is collectively stored by all of the miners.

The Permaweb is where DApps and users interact with Arweave data. The Permaweb can also be reached from the regular Web 2 stack, including Google.
  • Storing content directly on a blockchain derivative called the “blockweave.” While blockchains are a linear chain of blocks containing transactions, the blockweave is a linear mesh of blocks that connects each block to its previous block as well as a random historical block.
  • Securing that content with a novel consensus mechanism called “proof of access” (PoA). PoA is enabled by the blockweave and forces miners to prove that they have access to old data in order to add new data.
  • Using a single upfront payment for permanent storage instead of storage contracts for time-locked storage.
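A toy model makes the proof-of-access idea concrete. It is a sketch under our own assumptions rather than Arweave's protocol code: the recall block is chosen here as the previous block's hash modulo the current height, and the helper names are hypothetical.

```python
import hashlib


def recall_index(prev_hash: bytes, height: int) -> int:
    """Deterministically pick a historical block from the previous block's hash."""
    return int.from_bytes(prev_hash, "big") % height


def mine(weave, stored, data):
    """Try to add a block. `stored` is the set of block heights this miner holds."""
    prev = weave[-1]
    idx = recall_index(prev["hash"], len(weave))
    if idx not in stored:
        return None  # no access to the recall block, so no block can be produced
    recall = weave[idx]
    digest = hashlib.sha256(prev["hash"] + recall["hash"] + data).digest()
    return {"height": len(weave), "recall": idx, "data": data, "hash": digest}


genesis = {"height": 0, "recall": None, "data": b"genesis",
           "hash": hashlib.sha256(b"genesis").digest()}
weave = [genesis]
for i in range(1, 6):
    # A miner that stores every block can always extend the weave.
    weave.append(mine(weave, set(range(len(weave))), f"tx-{i}".encode()))

full_miner = set(range(len(weave)))
needed = recall_index(weave[-1]["hash"], len(weave))
partial_miner = full_miner - {needed}   # this miner dropped exactly the wrong block

print(mine(weave, full_miner, b"tx-6") is not None)
print(mine(weave, partial_miner, b"tx-6") is None)
```

Since the recall block changes unpredictably with every new block, the more history a miner stores, the more often it is eligible to mine, which is the incentive described above.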

How is the one-time fee calculated? Arweave assumes that the cost of storage will decline over time and sums the projected costs over all future periods. Right now, this cost is much higher than AWS.
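The reason such a sum stays finite is that steadily declining costs form a geometric series. The sketch below uses hypothetical numbers (a first-year cost of 1 unit per GB and a 10% annual decline) purely to show the arithmetic; these are not Arweave's actual parameters.

```python
def upfront_fee(cost_first_year: float, annual_decline: float, years: int) -> float:
    """Sum yearly storage costs that shrink by `annual_decline` each year."""
    return sum(cost_first_year * (1 - annual_decline) ** t for t in range(years))


def upfront_fee_forever(cost_first_year: float, annual_decline: float) -> float:
    """Closed form of the infinite geometric series: c / d."""
    return cost_first_year / annual_decline


cost = 1.0      # hypothetical cost of storing 1 GB for the first year
decline = 0.10  # hypothetical 10% yearly decline in storage cost

for horizon in (10, 50, 200):
    print(horizon, round(upfront_fee(cost, decline, horizon), 4))
print("forever", upfront_fee_forever(cost, decline))
```

The slower the assumed decline, the larger the upfront fee: halving the decline rate doubles the fee, which is one reason a conservatively priced permanent store costs more upfront than pay-as-you-go cloud storage.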

How is Arweave unique? (1) Arweave is smart contract compatible, meaning that DApps can build on this permanently stored network; and (2) information on Arweave can be searched via Google.

Arweave Ecosystem:

Source: Foresight Ventures

Competitors:

  • Arweave may face a major contender for permanent storage in the near future, with Filecoin planning to launch its Filecoin Virtual Machine (FVM) this year. The FVM will enable developers to automate permanent storage through Filecoin smart contracts. Both the FVM and Arweave will be able to support data archival and to create a preservation layer for humanity’s most important information.

Why is Decentralized Storage Useful?

  1. One of the primary use cases for decentralized storage is storing NFT metadata and hypermedia. NFT metadata refers to a JSON document that includes descriptive information about an NFT, such as its name, what it is about, a link to the associated hypermedia, traits, and so forth. NFT hypermedia refers to the graphics, audio, and video representing digital art, profile pictures, collectibles, music, and so forth that people are paying to own when buying NFTs.
  2. The long-term goal of Web3.Storage is to become the de facto storage layer of web3. To that end, Filecoin is building its partnerships with major protocols in the web3 space, including Polygon and Solana. In August last year, Polygon announced its native bridge to Filecoin, envisioning an interoperable environment where Filecoin brings greater functionality to Polygon applications that require decentralized and verifiable data storage services.
  3. Another area rapidly developing in the decentralized storage ecosystem is developer tooling and the communities forming around it. From integrated development environments to web hosting to various services for facilitating file and code storage and usage, it is becoming clear that developing on decentralized storage networks is rapidly becoming more streamlined.
  4. Others:
    1. Metaverse/Gaming – Decentralized storage can be useful for user monetization in metaverse games. For example, in Mona, gamers display their digital artwork in virtual art spaces and sell them to collectors. Any transaction transferring data from one player to another can be tracked on the Filecoin blockchain. In addition, Blast and Gala Games use Filecoin for backup data supporting gamers’ payments and revenue generation.
    2. Social Networks/Communication – Matrix is a communications protocol built to support chat, VoIP, IoT, VR/AR, social, and more. Originally a server-oriented network, Matrix is evolving into a hybrid P2P network with IPFS supporting the P2P Matrix. The vision is to empower users to have more autonomy and privacy over their data (e.g., by storing the data in IPFS by embedding their own servers into their Matrix client).
    3. Computation – Ceramic is a decentralized content computation network. It is built upon IPFS and features a permissionless design that allows anyone to openly create, discover, query, and build upon existing data without needing centralized servers or one-off APIs, and without worrying about data integrity, owing to IPFS data integrity controls. It is compatible with persistence networks, including not only Filecoin but also Arweave and Sia.
    4. Content Publishing
    5. Content Delivery Networks
    6. Permanent Storage/Web2 Datasets
    7. Decentralized Identity
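To make the NFT use case in item 1 concrete, here is a sketch of what such a metadata JSON document might look like. The field names follow a commonly used convention, and every value, including the placeholder storage links, is a hypothetical example rather than a real asset.

```python
import json

# Hypothetical NFT metadata; the URIs are placeholders, not real content IDs.
metadata = {
    "name": "Example Collectible #1",
    "description": "An illustrative NFT whose media lives on decentralized storage.",
    "image": "ipfs://<content-id>/image.png",        # link to the hypermedia
    "animation_url": "ar://<transaction-id>",        # optional audio/video
    "attributes": [
        {"trait_type": "Background", "value": "Blue"},
        {"trait_type": "Rarity", "value": "Common"},
    ],
}

document = json.dumps(metadata, indent=2)
print(document)
```

The token on chain typically stores only a link to this document, so if the document or the media it points to disappears from a centralized server, the NFT loses what it represents. That is why where the JSON and the hypermedia are stored matters.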

Case Studies

Filecoin and Arweave have been discussed extensively above, so we will skip those.

Stratos

What is unique about Stratos? The Stratos network has three layers: (1) blockchain layer; (2) meta-service layer; and (3) resource layer. It offers decentralized blockchain, storage, database, computation, commercial data-processing, and content-delivery services all in one network.

Stratos currently has only a small market cap. For comparison, FIL is at a $4Bn market cap, Arweave at $1Bn, and Theta at $3Bn.

Source: Stratos whitepaper.
  1. The Blockchain Layer. The blockchain layer calculates, compensates, verifies, and finalizes the activities that occur on the network as a whole. This is basically the layer to execute the payment part of the storage service.
    1. Service fee settlement;
    2. Incentive mechanism settlement;
    3. Payment services;
    4. Storage content verification; and
    5. Verification services for the resource consumption in the data mesh.
  2. Meta-Service Layer. The meta-service layer utilizes PoA (Proof of Authority, not to be confused with Arweave's proof of access) to elect meta-nodes randomly to execute indexing, auditing, managing, and routing services. This layer is also in charge of retrieving the data.
  3. Resource Layer. The resource layer utilizes Proof of Traffic to calculate incentive returns for each service provider. This layer provides services such as storage, computing, content delivery, PaaS, SaaS, and more. The storage layer of the resource node chooses the storage method and execution strategy according to the data type. This is the layer where actual storage happens.

How is Stratos different? It (1) automatically balances data allocation (driven by regional demand), thereby reducing latency; and (2) automatically cleans up replicas.

Stratos Token: (1) storage service providers receive STOS; and (2) users pay STOS based on their usage of resources.

STOS' Utilities.

Stratos’ Investors. Fenbushi Capital was co-founded by Vitalik Buterin.

Stratos backers.

Sia (and Siacoin)

Sia offers decentralized cloud storage to users by dividing their data into smaller pieces that are then spread across nodes within the network. The network offers "underutilized hard drive capacity" to users, which is kept secure with blockchain technology. In addition, data transactions are protected with smart contracts, which allows for a more affordable cloud storage option. Sia users keep their private keys, meaning their stored data cannot be accessed or altered by others.

Storj

Like Filecoin and Sia, Storj splits your data into smaller pieces, increasing its durability and security. Storj splits each file into exactly eighty pieces, and each piece is sent to a separate node. You can rent storage space from other users within the Storj network for a fee.
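To see why spreading eighty pieces across separate nodes helps durability, consider a sketch in which any k of the n = 80 pieces are enough to rebuild the file. The threshold k = 29 and the per-node availability figures are assumptions for illustration, and node failures are treated as independent, which real networks only approximate.

```python
from math import comb


def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n pieces are reachable,
    when each node is independently online with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_PIECES = 80      # from the text: each file is split into eighty pieces
K_NEEDED = 29      # assumption: pieces required to reconstruct the file

for p in (0.5, 0.7, 0.9):
    print(f"node uptime {p:.0%}: file available {availability(N_PIECES, K_NEEDED, p):.6f}")

print(f"storage expansion factor: {N_PIECES / K_NEEDED:.2f}x")
print(f"if all 80 pieces were required at 90% uptime: {availability(80, 80, 0.9):.6f}")
```

The contrast between the last line and the earlier ones is the point: requiring every piece makes a file fragile, while requiring only a subset makes it robust even when individual nodes are unreliable, at the price of storing more than the original file size.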

Like the other two storage platforms here, Storj has its own digital currency of the same name. Storj pays users in its native currency for renting out their storage space. A user who does this becomes an SNO (Storage Node Operator). Many choose to earn money as SNOs rather than rent cloud storage space for their own data.

Cited Sources

https://www.theblockcrypto.com/post/149172/decentralized-storage-a-primer-commissioned-by-w3bcloud

https://www.makeuseof.com/what-are-decentralized-storage-cryptos/

https://www.youtube.com/watch?v=lWofsn2zqWc&t=1s

https://www.foresightventures.com/focusdetail-16.html

Arweave TX
2eDwpbdq6LuqD2SFCidhwjzbc3Gg7bH7gJU7RWjT1eE
Ethereum Address
0x85d3AAF7560395FD9a061EEBE5999A1cDCFBE32d
Content Digest
3Qj6DW6MVDkgGDU7BftcH7bJ6H3kFR3grPwqXAXn2A4