How Ceramic fills the missing piece in on-chain data management, and what comes next
February 23rd, 2022

TL;DR: Unlike most data-related middleware, Ceramic does not manage the dataset itself. Instead, it provides a protocol that defines how data is stored, updated, and accessed. However, real-time data sync is still hard to handle on the Ceramic Network, especially for big data in Web3.

All data in the Ceramic network is stored on IPFS in one of three standard doctypes: 3IDs, account links, and tiles.

The first type is tiles. Data is formatted as defined in tile files, which makes tiles the basic building block of all documents. Other kinds of information can also be stored in tiles: metadata, policies (collection, service, and privacy policies), agreements, and claims are all use cases.

The second type is 3IDs, which are decentralized identifiers (DIDs): unique identities that can be used to sign documents and interact with different smart contracts. The third is account links, which provide the cryptographic information of different DIDs. These two types serve as essential components of the Web3 world.
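To make the three doctypes more concrete, here is a minimal sketch in plain Python. It is not Ceramic's actual API: the class names, the fields, the `resolve_did` helper, and the identifiers are all made up for illustration, and the link between an account and a DID is reduced to a simple lookup with no cryptographic proof.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tile:
    """Generic document: arbitrary content owned by one controller DID."""
    controller: str                      # DID allowed to update this tile
    content: dict = field(default_factory=dict)


@dataclass
class AccountLink:
    """Associates one blockchain account with one DID (proof omitted here)."""
    account: str                         # hypothetical account identifier
    did: str


def resolve_did(links: List[AccountLink], account: str) -> Optional[str]:
    """Return the DID linked to an account, or None if no link exists."""
    for link in links:
        if link.account == account:
            return link.did
    return None


alice = "did:3:alice-example"            # made-up identifier
links = [AccountLink(account="0xabc", did=alice)]
policy = Tile(controller=alice, content={"type": "privacy policy", "version": 1})
```

In this toy model, an app that only knows the account `0xabc` can look up the DID behind it and then find the tiles that DID controls.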

Ceramic itself does not provide data services the way Dune Analytics or other data API providers do. It simply provides a protocol for defining data structures and for governed data access and updates. Data service providers can build on this protocol to run businesses such as data hosting, indexing, payments, or other arbitrary web APIs. This means the Ceramic protocol serves as the base layer for data service providers and for apps that depend on DIDs and social graph data.

One thing to note: querying and writing to IPFS is still very time-consuming, and it is not yet fast enough to update a social graph in under a second. Typically, a data update requires two steps: find the storage location, then update the content. Ceramic uses a Git-tree-like structure that records only the changelog (defined as a “stream” in Ceramic), which lets an update skip the lookup step. Even so, updating data still takes several seconds. Even with the latest version of go-ipfs, it takes about 2 seconds to put new data and query it. In a social app with tens of thousands of users, users are very unlikely to tolerate response times of more than one second.
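The changelog idea can be sketched as an append-only log of hash-linked commits. This is only an illustration under simplifying assumptions, not Ceramic's implementation: the `Stream` class, the `commit_id` helper, and the commit layout are hypothetical, commits are kept in memory instead of IPFS, and SHA-256 over JSON stands in for real content addressing.

```python
import hashlib
import json


def commit_id(commit: dict) -> str:
    """Content-address a commit by hashing its canonical JSON form."""
    data = json.dumps(commit, sort_keys=True).encode()
    return hashlib.sha256(data).hexdigest()


class Stream:
    """Append-only log of commits; each commit points at the previous one."""

    def __init__(self, initial: dict):
        genesis = {"prev": None, "patch": initial}
        self.commits = [genesis]
        self.tip = commit_id(genesis)

    def update(self, patch: dict) -> str:
        # The writer only needs the current tip, not the location of old content.
        commit = {"prev": self.tip, "patch": patch}
        self.commits.append(commit)
        self.tip = commit_id(commit)
        return self.tip

    def state(self) -> dict:
        # The current content is derived by replaying every patch in order.
        result = {}
        for commit in self.commits:
            result.update(commit["patch"])
        return result


profile = Stream({"name": "alice", "following": 0})
profile.update({"following": 1})
```

The update path never searches for the existing document; it just appends a commit that references the tip. The cost that remains in the real system is the network round trip to put and fetch each commit.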

Innovations in Ceramic are:

  • Lowering the development barrier by providing a template for building an in-house database
  • Allowing cross-referencing across different databases
  • An innovative Git-tree structure for data storage and updates
  • Data sovereignty implemented through DIDs, with DID-governed data access
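
As a rough illustration of the last point, DID-governed access can be pictured as a document that accepts updates only from its controller. This is a sketch under stated assumptions rather than how Ceramic enforces it: `ControlledDoc` and the DID strings are hypothetical, and a plain string comparison stands in for the signature verification a real network would presumably perform.

```python
class ControlledDoc:
    """A document that only its controller DID may update."""

    def __init__(self, controller: str, content: dict):
        self.controller = controller
        self.content = dict(content)

    def update(self, caller_did: str, patch: dict) -> None:
        # Stand-in for signature verification: compare caller to controller.
        if caller_did != self.controller:
            raise PermissionError(f"{caller_did} does not control this document")
        self.content.update(patch)


doc = ControlledDoc("did:3:alice-example", {"bio": "hello"})
doc.update("did:3:alice-example", {"bio": "hello, web3"})   # accepted

try:
    doc.update("did:3:mallory-example", {"bio": "hacked"})  # rejected
    rejected = False
except PermissionError:
    rejected = True
```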

Things to work on:

  • Real-time updates of on-chain data, especially social graph data (nearly impossible at current IPFS response times), through either more optimized data structures or better query algorithms