Streamline IPFS file uploads with CID precomputing

April 10th, 2023

The InterPlanetary File System (IPFS) has become an important tool for building decentralized applications, especially with the increasing adoption of NFTs and the need for off-chain storage.

Every day, millions of files are uploaded and pinned to IPFS to store data relevant to tens of millions of users.

Often, the same files are uploaded to IPFS over and over again:

Developers frequently create testnet versions of their NFT contracts and upload all of their assets to mint NFTs, only to re-upload all their files later when they create their mainnet collection.
Common media files like popular artwork often get coincidentally re-uploaded to IPFS many times by different users.
At thirdweb, we upload ABI files to IPFS every time a smart contract is deployed using our tools (you can check out this post on self-verifying contracts to learn why we do this).

This repetitive upload process consumes unnecessary time and resources and makes for a painful user experience.

Thankfully, there’s a way to eliminate this and ensure that once a file is uploaded to IPFS, users will never need to upload it again.

The magic lies in a method called CID precomputing, which allows us to drastically improve the UX of working with IPFS.

Let’s take a look at how this works!

What happens when you upload files to IPFS?

Before we take a look at how CID precomputing actually works, let’s first review what happens when users upload files to IPFS:

1. First, a user uploads a file from their local file system to the browser.
2. The browser then sends this file to a server.
3. The server stores this file somewhere (like Filecoin, Arweave, S3, etc.) and “pins” it to IPFS.
4. The server obtains an IPFS Content Identifier (CID) for the file.
5. The IPFS CID is relayed back to the user and can be used for other purposes.

For context, an IPFS Content Identifier, or CID, is a unique identifier for each file on IPFS. It can be used to access the file by requesting the ipfs://<cid> URL directly, or by requesting the same content over HTTP through an IPFS gateway that proxies to it.
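As a rough illustration of the gateway route, here is a minimal sketch of turning an ipfs:// URI into an HTTP URL. The helper name toGatewayUrl is hypothetical, and the default gateway (https://ipfs.io/ipfs/) is just an assumption; any gateway base URL could be passed instead.

```typescript
// Hypothetical helper (not part of any SDK): rewrite an ipfs:// URI
// into an HTTP URL served by a gateway.
// The default gateway is an assumption; pass your own if you prefer.
function toGatewayUrl(
  uri: string,
  gateway: string = "https://ipfs.io/ipfs/",
): string {
  const prefix = "ipfs://";
  if (!uri.startsWith(prefix)) {
    throw new Error(`Not an ipfs:// URI: ${uri}`);
  }
  // Make sure the gateway base ends with exactly one slash before appending.
  const base = gateway.endsWith("/") ? gateway : `${gateway}/`;
  return base + uri.slice(prefix.length);
}
```

Calling toGatewayUrl("ipfs://<cid>/image.jpg") would produce a URL under the chosen gateway that any regular HTTP client can request.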

Importantly, the ultimate goal of this entire upload process is to return a CID to the user.

The goal of CID precomputing is to short-circuit this process by skipping steps 2-4 and returning a CID to users directly in the browser, without having to upload any files to the server.

How does CID precomputing work?

The real power in CIDs lies in the method that’s used to create them.

When files are uploaded to IPFS, they aren’t just assigned a random CID.

Instead, each file is split into many smaller chunks of byte data. These chunks are then repeatedly hashed together until a single root hash is created, which becomes the file’s IPFS CID.

This process is deterministic, meaning that any file with some specific data will always result in the same root hash, and thus, will be stored and accessible via the same IPFS CID.
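To make the "chunk, then hash up to a root" idea concrete, here is a deliberately simplified sketch. It is not the real IPFS algorithm: actual CIDs depend on the UnixFS/DAG-PB encoding, multihash, and multibase, so the hex string below is not a valid CID. The function names and the 256 KiB chunk size are assumptions chosen for illustration, and the code uses Node's built-in crypto module.

```typescript
import { createHash } from "node:crypto";

// Simplified illustration only. The output is a plain SHA-256 hex digest,
// NOT a valid IPFS CID.
const CHUNK_SIZE = 256 * 1024; // assumed chunk size for this sketch

function sha256(data: Uint8Array): Uint8Array {
  return createHash("sha256").update(data).digest();
}

function rootHash(file: Uint8Array, chunkSize: number = CHUNK_SIZE): string {
  // 1. Split the file into fixed-size chunks and hash each chunk.
  let level: Uint8Array[] = [];
  for (let offset = 0; offset < file.length; offset += chunkSize) {
    level.push(sha256(file.subarray(offset, offset + chunkSize)));
  }
  if (level.length === 0) {
    level.push(sha256(file)); // empty file: hash the empty input
  }

  // 2. Hash neighbouring pairs together until a single root hash remains.
  while (level.length > 1) {
    const next: Uint8Array[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const pair =
        i + 1 < level.length
          ? Buffer.concat([level[i], level[i + 1]])
          : level[i]; // odd one out is re-hashed on its own
      next.push(sha256(pair));
    }
    level = next;
  }
  return Buffer.from(level[0]).toString("hex");
}
```

Because every step depends only on the file's bytes, two people hashing the same file get the same root, and changing a single byte changes it. That property is what the rest of this post relies on.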

CID precomputing takes advantage of this deterministic nature of CID computation.

Here’s how it works:

First, we follow the specified deterministic process to calculate the IPFS CIDs of our files in the browser, before uploading them to the server.
Next, we have to check if the file has already been uploaded and pinned to IPFS. Since pinned files are always accessible via the ipfs://<cid> URL, we can simply make a HEAD request to this URL, or to a gateway that proxies to it. This lets us check if the file exists (based on whether we get back a 200 HTTP response) without having to download the entire file data.
Finally, if the file is found, it means the file has already been uploaded to IPFS and is accessible via the computed CID, and we can short-circuit the process within the browser!
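The three steps above can be sketched roughly as follows. Everything here is illustrative rather than thirdweb's actual implementation: isAlreadyPinned, uploadWithPrecompute, computeCid, and uploadToServer are hypothetical names, and the default gateway URL is an assumption. The fetch function is passed in as a parameter so the flow can be exercised without real network access.

```typescript
// Minimal shape of fetch that this sketch needs, so it can be swapped out.
type HeadFetch = (
  url: string,
  init: { method: string },
) => Promise<{ status: number }>;

// Returns true when the gateway answers the HEAD request with HTTP 200.
async function isAlreadyPinned(
  cid: string,
  fetchImpl: HeadFetch,
  gateway: string = "https://ipfs.io/ipfs/", // assumed public gateway
): Promise<boolean> {
  try {
    const res = await fetchImpl(`${gateway}${cid}`, { method: "HEAD" });
    return res.status === 200;
  } catch {
    // Network failure: treat as "not found" and fall back to uploading.
    return false;
  }
}

// Hypothetical flow: computeCid and uploadToServer stand in for your own code.
async function uploadWithPrecompute(
  file: Uint8Array,
  computeCid: (file: Uint8Array) => Promise<string>,
  uploadToServer: (file: Uint8Array) => Promise<string>,
  fetchImpl: HeadFetch,
): Promise<string> {
  const cid = await computeCid(file); // step 1: compute the CID locally
  if (await isAlreadyPinned(cid, fetchImpl)) {
    return cid; // steps 2-3: file already exists, skip the upload
  }
  return uploadToServer(file); // otherwise, upload as usual
}
```

In a browser or a recent Node version, the global fetch can be passed as fetchImpl. In practice you would likely also want a timeout on the HEAD request, since a gateway may take a long time to answer for content it cannot find.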

How do you use CID precomputing in your apps?

All applications allowing users to upload files to IPFS or doing any uploads to IPFS internally stand to gain significant performance benefits by integrating CID precomputing into their flows.

This is why we’ve made it easy to use CID precomputing on all uploads by using our decentralized storage SDK.

With the following simple snippet, you can enable IPFS uploads with this optimization automatically built in!

import { ThirdwebStorage } from "@thirdweb-dev/storage";

const storage = new ThirdwebStorage();

// CID precomputation will automatically take place on all uploads
const cid = await storage.upload(file);

Additionally, we’ve exposed lower-level utilities for you to do CID precomputing manually in your own flows.

You can use our getCIDForUpload function to do the deterministic CID computation for your file uploads, and you can use the isUploaded function to check if a file with a given CID has already been uploaded to IPFS:

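import fs from "fs";
import { getCIDForUpload, isUploaded } from "@thirdweb-dev/storage";

const fileName = "image.jpg";
const fileData = fs.readFileSync("path/to/file");

const cid = await getCIDForUpload([fileData], [fileName]);
// Use a different name than isUploaded() so the variable doesn't shadow the function
const alreadyUploaded = await isUploaded(cid);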

You can get access to all this functionality by installing the thirdweb storage SDK via npm install @thirdweb-dev/storage. With these tools, we hope to improve the DX of working with IPFS!

Thanks for reading - you can learn more about the Storage SDK on the official thirdweb Storage Documentation.

Subscribe to Adam Majmudar
