LODs for NFTs

Basics of LOD

LOD (Level Of Detail) improves runtime performance by rendering simpler versions of models for distant objects. This helps shorten response time, since less data needs to be downloaded before objects start appearing, and saves network usage overall, since unused LODs are never downloaded.

These benefits combine into the ability to create richer content. World builders today are hesitant to put many high quality models in the same scene because of the performance hit, which these techniques can help mitigate.

> Source: https://hubs.mozilla.com/labs/hubs-lod-support/ 👈 READ this!

ELI5 (Explain Like I'm 5)

Imagine you have a toy car. When you hold it up close, you can see all the tiny details like the wheels and the little stickers. But when the car is far away, you can't see those details anymore, and it doesn't matter, because it still looks like a car. LOD (levels of detail) in games is like that! When things are close, they have lots of details, but when they're far away, they have fewer details to make the game run faster.

GPT-4 as John Carmack explaining LOD

Levels of detail (LOD) is a technique used in computer graphics to optimize the rendering of 3D models. By creating multiple versions of a model with varying polygon counts, we can display the most appropriate version based on the viewer's distance. This saves processing power and maintains performance without sacrificing visual quality. In essence, LOD dynamically adjusts the complexity of 3D objects to balance performance and visual fidelity.
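
The idea above can be sketched as a tiny, engine-agnostic selection function: pick which version of a model to render based on distance. The distance thresholds here are made-up example values, not from any particular engine.

```javascript
// Minimal sketch of distance-based LOD selection.
// thresholds[i] is the distance at which LOD i+1 takes over,
// e.g. [10, 30] => LOD0 below 10m, LOD1 below 30m, LOD2 beyond.
function selectLod(distance, thresholds) {
  for (let i = 0; i < thresholds.length; i++) {
    if (distance < thresholds[i]) return i;
  }
  return thresholds.length; // farthest: lowest-detail level
}

console.log(selectLod(5, [10, 30]));  // 0 -> full-detail model up close
console.log(selectLod(20, [10, 30])); // 1
console.log(selectLod(50, [10, 30])); // 2 -> lowest detail in the distance
```

Real engines often use projected screen coverage instead of raw distance, but the switching logic is the same shape.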

How to Generate LODs

Pondering ye olde LODs

Results when combining LOD with original model

  • LOD Triangles: 20k (original) -> 6.8k -> 6.6k tris

  • LOD File sizes: 2.6 MB (original) -> 485 KB -> 285 KB

  • Downside: Original total file size bloated from 2.6 MB → 3.1 MB with combined glTF
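
To make that trade-off concrete, here's the arithmetic behind the numbers above (sizes approximated in KB, with LOD0 meaning the original model): the combined glb is smaller than the sum of the separate exports, but still adds ~500 KB over the original alone, which matters for viewers that can't do partial downloads.

```javascript
// Back-of-envelope check on the sizes listed above (in KB).
const sizes = { lod0: 2600, lod1: 485, lod2: 285, combined: 3100 };

const separateTotal = sizes.lod0 + sizes.lod1 + sizes.lod2; // all files exported separately
const bloatVsOriginal = sizes.combined - sizes.lod0;        // extra bytes in the combined glb

console.log(separateTotal);                  // 3370
console.log(bloatVsOriginal);                // 500
console.log(sizes.combined < separateTotal); // true -> bundling beats separate files
```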

This one isn't included with the dataset on GitHub, but you can generate it yourself with takahirox's script and the command above.

2 LODs made with glTF-transform and model-splitter

In this example, I generate LODs at 50% and 10% ratios (including textures) of 1.glb with two methods: a glTF-transform script, and model-splitter, which exports separate files. Both are included in the Boomboxheads V2 assets repo.

Download sample here: https://github.com/gm3/boomboxheads-v2-assets/blob/main/Models/glb_LOD/combined/1_LOD.glb

A couple of observations:

  • LOD 1 is 9x smaller than LOD0, but only looks good from a distance

  • The file size of LOD 0 + 1 < original file size

  • With more control over layers, file size can be smaller

    • Tested by optimizing the model manually in Blender; decimating the body separately from the glasses produced much better results

LOD + Draco
Here's a look at the size difference between the original glb files exported out of boom-tools, the Draco-compressed versions, and the Draco-compressed x2 MSFT_lod versions:

Imagine only needing to pull ~50 KB for rendering distant models instead of the average ~2 MB file size each
  • 420 original size glbs = 1.47 GB

  • 420 draco compressed glbs = 863 MB

  • 420 draco x2 MSFT_lod glbs = 883 MB 😯

Wait a second, 20 MB difference for twice as many glbs? I’m not really sure how, because that’s like a 43x difference between the LODs in file size, and would mean on average the LOD1 is ~48 KB (48 * 420 = 20 MB). However, when I exported a few of these out manually, they seemed on average about 260 KB in size.
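
Spelling out that back-of-envelope math, using the totals measured above:

```javascript
// Sanity check on the averages quoted above.
const dracoTotalMB = 863;    // 420 draco-compressed glbs
const dracoLodTotalMB = 883; // same glbs with an embedded MSFT_lod level
const count = 420;

// Extra bytes contributed by all the embedded LOD1 meshes combined:
const extraKB = (dracoLodTotalMB - dracoTotalMB) * 1024; // 20480 KB
const avgLod1KB = extraKB / count; // average size of one embedded LOD1

console.log(avgLod1KB.toFixed(1)); // "48.8" -- far below the ~260 KB separate exports
```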

WUT DA!?

I tested a few more and realized the combined LOD version using MSFT_lod extension is smaller than separate exported glTF files. Here’s a visual:

MSFT_lod file size is smaller combined than having the same glTF files exported separately
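
For reference, per the MSFT_lod spec a node lists its lower-detail counterparts in an `ids` array, with optional screen-coverage switch thresholds under `extras`. A minimal sketch of the structure as a JS object (node names and coverage values here are illustrative, not from the Boomboxheads files):

```javascript
// Shape of a glTF node using the MSFT_lod extension.
const gltf = {
  nodes: [
    {
      name: 'head_LOD0',
      mesh: 0,
      // node indices of the lower-detail versions, highest first
      extensions: { MSFT_lod: { ids: [1, 2] } },
      // screen coverage below which each successive LOD kicks in
      extras: { MSFT_screencoverage: [0.5, 0.2, 0.0] },
    },
    { name: 'head_LOD1', mesh: 1 },
    { name: 'head_LOD2', mesh: 2 },
  ],
};

console.log(gltf.nodes[0].extensions.MSFT_lod.ids); // [ 1, 2 ]
```

A viewer that doesn't know the extension just renders node 0 (LOD0), which is why the combined file stays backwards compatible.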

Downsides

  • Currently not that many platforms support the MSFT_lod extension

  • Even if a combined LOD model is similar in size to a Draco-compressed one, a platform without HTTP range request support still downloads the whole file

  • Draw calls doubled, tested with gltf-viewer


When to Generate LODs

This topic could be worth a whole post in itself. In short, I realize how much upfront work there is to generate LODs and attach them to metadata. The process outlined here won't scale if it has to be driven by artists pre-minting.

Then I realized something: our operating systems automatically generate thumbnails for various media files at rest, so why not do the same for our 3D models, and by extension for the world computer? After all, collectors and platforms are more incentivized to run nodes that can do this type of automated processing (generating LODs from data at rest) and to seed the content that powers a decentralized network.

In my research I found that tons of open source photo and video gallery software automates the process of generating thumbnails; perhaps the techniques for generating 3D model LODs outlined in this post can be applied similarly for 3D galleries?

This would lift a ton of upfront work from artists, letting them just mint the final piece while others handle producing the gracefully degrading levels of detail. Many platforms already do this for images, but don't directly offer you the files; instead, these files are usually produced by marketplaces that serve them via CDNs to improve loading times.

Another possibility might be for virtual world platforms to do LOD processing on the backend, and then offer the ability to export the results for use across the metaverse.


Progressive Loading

If you only need LOD1, does it make sense to download the entire file (~8x bigger)?

Earlier in the experiment, LOD1 was 285 KB and LOD0 was ~2.1 MB. For objects in the distance, downloading the entire file when you only need a small representation is wasteful. Generating LODs is one thing; loading them is another, and that part is up to platforms to implement. One method would be to integrate HTTP range requests and progressive loading, as illustrated below (source: https://hubs.mozilla.com/labs/hubs-lod-support/)

HTTP range requests enable partial downloading: the client asks the server to send back only part of a file, and a successful response comes back with a 206 Partial Content status code. This makes it possible to download just a specific level out of a bundled glb file.
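
A sketch of what that could look like in a loader, assuming the byte offsets per LOD level are already known. Everything here is illustrative (in practice a loader would read the offsets out of the glb's JSON chunk, and this is not the three.js API):

```javascript
// Build the Range header for a given LOD level from a table of
// byte offsets. Offsets are hypothetical placeholders.
function rangeHeaderFor(byteRanges, level) {
  const { start, end } = byteRanges[level];
  return `bytes=${start}-${end}`;
}

// Using it with fetch: a 206 Partial Content response carries
// only the requested slice of the bundled glb.
async function fetchLodBytes(url, byteRanges, level) {
  const res = await fetch(url, {
    headers: { Range: rangeHeaderFor(byteRanges, level) },
  });
  if (res.status !== 206) {
    throw new Error(`server did not honor range request: ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}

console.log(rangeHeaderFor([{ start: 0, end: 999 }, { start: 1000, end: 1999 }], 1));
// "bytes=1000-1999"
```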

Btw, all modern web browsers support range requests; the glTF loaders in 3D web graphics libraries like three.js, PlayCanvas, and Babylon.js just don't fully support them yet.

If interested in going deeper on this topic, I highly recommend reading the Hubs LOD support blog post linked above. The author saw tooling that didn't exist yet for MSFT_lod, so he wrote it for multiple libraries and Blender. There's now an open PR for HTTP range requests slated as a milestone for the next major release of three.js, thanks to takahirox 👏

Other ideas came from the digital assets group in OMF, with discussions like using glTF as an arbitrary-binary container format with a simple JSON interface, and the Dacti package format, which is optimized for sparsely fetching data from CDNs in real-time.

Perhaps it's also worth seeing if this project can become a case study for glXF, a WIP specification from the Khronos Group for creating a new file type in the glTF ecosystem for efficient composition of multiple glTF assets.

What about reading the paths to the files from the NFT metadata? 🤔


LODs in NFT Metadata

Having references to 3D model LODs in NFT metadata has some benefits and shortcomings compared to combining LODs into glTF files via the MSFT_lod extension:

Easier updates

If a new, improved LOD model becomes available, it may be easier to update the reference in metadata rather than updating the entire glTF file.

Reduced file size

Including multiple LODs within a glTF file can result in larger file sizes, which can negatively affect loading times and performance on platforms that don't support MSFT_lod and progressive loading. By referencing LODs in metadata instead, the glTF file stays smaller and easier to manage.
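
For illustration, resolving a LOD from metadata could look something like this. The field names and URIs below are hypothetical, not the actual ETM or Boomboxheads schema:

```javascript
// Hypothetical NFT metadata with per-LOD asset references.
const metadata = {
  name: 'Boomboxhead #1',
  assets: [
    { uri: 'ipfs://Qm.../1.glb',      file_type: 'model/gltf-binary', lod: 0 },
    { uri: 'ipfs://Qm.../1_lod1.glb', file_type: 'model/gltf-binary', lod: 1 },
    { uri: 'ipfs://Qm.../1_lod2.glb', file_type: 'model/gltf-binary', lod: 2 },
  ],
};

// Pick the asset for the requested LOD, falling back to the
// highest-detail model if that level isn't listed.
function assetForLod(meta, level) {
  const match = meta.assets.find((a) => a.lod === level);
  return match ?? meta.assets.find((a) => a.lod === 0);
}

console.log(assetForLod(metadata, 2).uri); // ipfs://Qm.../1_lod2.glb
console.log(assetForLod(metadata, 5).lod); // 0 -> falls back to LOD0
```

The swapping logic (when to fetch which level) still has to live in the application code, which is the extra work described below.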

HOWEVER...

Having references to LODs in NFT metadata requires additional work to implement, as it involves creating a separate data structure to store the references and handling the loading and swapping of LOD models in the application code. Additionally, not all platforms or applications may support LOD references in metadata, so developers may need to provide alternative solutions in those cases.

Also, if the LOD assets are being pulled from separate IPFS hashes, the speed benefit might be negligible, or even negative, if relying on this method alone. Correct me if I'm wrong, but since you're dealing with disparate files for the LODs, each CID needs to propagate individually through the swarm, and how long that takes to resolve can vary. Perhaps this could be mitigated, and is worth testing, by bundling all the files for the NFT together in a folder while utilizing IPNS, or via some combination with ENS domains.

Right now we have all the files for Boomboxheads V2 organized in a GitHub repo that we can test with after minting is done. If interested, reach out on the M3 org discord - we have a decent supply of Filecoin and other tokens that we can put toward open bounties.


Multi-asset NFT Metadata

There are a few multi-asset NFT metadata schemes we've been looking at over the past year, developed by MetaFactory, Nifty Island, Cryptoavatars, and Metamundo. I think it's great that we're all taking slightly different approaches to the same problem while it's still early, to see which ones work out best.

Also, I don't think it matters too much in the end, because AI can pretty much one-shot translate between any of them and help us write data transformer programs that automate the conversion, such as this simple bash script.

For additional tooling, we can integrate LOD import / export directly into popular 3D programs such as Blender. Here are a couple of programs that would be useful starting places for interested devs:

Extensible Token Metadata (ETM)

As the use cases for NFTs have expanded, the need for an NFT to represent multiple files has grown and been addressed in several ways. However, the inconsistency in these varied approaches has rendered assets beyond images and videos unable to be ingested by NFT consumers in a streamlined way. This has forced consumers to create custom centralized tooling to access specific assets - a path that is not scalable or maintainable in the long-term.

This standard is an extension of ETM and provides a decentralized approach to representing multiple digital assets with a single NFT in a scalable and maintainable way.

The goal is to provide a streamlined approach to the following:

  • Associating multiple assets with an NFT such that custom or centralized tooling is not required

  • Supporting NFTs with heterogeneous media types (e.g. 3D models, animations, etc.)

  • Providing a clear definition of the file type for files associated with an NFT

I talked about ETM in the last blog post if you want to go back and read that first. I've now finished the script that converts all Boomboxheads V2 NFT metadata to the ETM metadata spec; the differences between them can be seen below:

Here's the current metadata for Boomboxheads V2 next to the ETM versions. If we mint out we can update the collection with the new spec

We've added additional fields for the glTF LODs and the 4k image, and replaced vrm_url with file_type within the assets node.
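
As a sketch of that kind of conversion (the field names below are guesses based on the description here, not the actual ETM spec or the real Boomboxheads metadata):

```javascript
// Move a legacy vrm_url field into an assets array that tags
// every file with a file_type, ETM-style. Hypothetical schema.
function toEtmLike(legacy) {
  const assets = [{ uri: legacy.vrm_url, file_type: 'model/vrm' }];
  for (const lodUri of legacy.lod_urls ?? []) {
    assets.push({ uri: lodUri, file_type: 'model/gltf-binary' });
  }
  // Drop the legacy fields, keep everything else as-is.
  const { vrm_url, lod_urls, ...rest } = legacy;
  return { ...rest, assets };
}

const out = toEtmLike({
  name: 'Boomboxhead #1',
  vrm_url: 'ipfs://Qm.../1.vrm',
  lod_urls: ['ipfs://Qm.../1_lod1.glb'],
});

console.log(out.assets.length); // 2
console.log('vrm_url' in out);  // false -> replaced by the assets node
```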

ETM is still experimental, but having spoken with several project founders who are all exploring how to link multiple assets to the same token, this spec seems to be the right direction IMO. More testing needs to be done by projects that have ready-made assets for it.

Additional Thoughts

I believe we're near the cusp of having a feature rich playground for metaverse builders exploring LODs for NFTs. There has been a mini Cambrian explosion of experiments with different approaches to represent the same digital asset / character with a token.

Sample of projects in web3 gaming scene exploring various levels of detail for their avatar collections

Case Study: CyberBrokers' Mechs

The CyberBrokers are a world-class team of folks who deeply understand the on-chain medium. The Mech design process is quite interesting:

  • Each mech part is an NFT you can mint / buy on secondary

  • Parts can be combined and recolored into a new NFT via web app

  • LOD files automatically generated and then encoded into the metadata

After assembling a Genesis Mech you get a fully-rigged 3D asset in a variety of formats such as GLB and VRM, with multiple resolutions each. These files are easy for owners to access, not locked behind a private API, and are immediately usable across apps.

The fully rigged 3D files are referenced in the description and the metadata of the token. Some people new to the crypto scene might think it's far too easy for others to yoink those assets. Even if someone did, it would be simple for web3-supporting platforms to verify whether an asset is authentic. Since the assets are referenced on-chain, platforms that support wallet connect can read and import the avatars directly from the metadata and give them their equivalent of a blue checkmark. Even if someone minted a mech on a different chain, it would not be the same, because the private key that minted the official Genesis Mechs contract is indisputably different; people could basically tell it came from a different factory.

Notice how they included the multiple LODs in the metadata, making it easy for platforms to ingest

One thing I would like to see added are checksums / hashes in the info.json or elsewhere. Since the files are renamed and referenced from the CyberBrokers website rather than by a cryptographic hash (like an IPFS CID), I think it would be a nice addition.

If you know about other projects doing something interesting on-chain with various levels of detail for the assets come swing by the M3 discord and say hi!


Conclusion

I already batch-converted all the Boomboxheads V2 models to MSFT_lod versions and to separate LODs, and the NFT metadata to the ETM spec, in preparation for a metaverse interop science lab, but we cannot update the collection until it finishes minting out. That's fine, because in the meantime we can run some isolated experiments by minting a few NFTs with ETM metadata to see how well the spec performs in practice.

If interested in joining the conversation about ETM, check out the GitHub discussions. If interested in supporting open research to benefit the 3D NFT scene, then why not mint a Boomboxhead - so far nearly 25% of Boomboxheads V2 have been minted. Lastly, thanks again for reading and supporting our work! <3

See you in the metaverse
- jin

This post is also a digital garden: https://hyperfy.io/artroom