9–5 metaverses and weekend metaverses, touch grass metaverses and touch simulation metaverses, living metaverses and dying metaverses. With a hypnotic timbre, Mark Zuckerberg tells a story about the world to come that says everything there will be just as it is here. Just as our room is now, so it will be in the world to come; where our baby sleeps now, there too it will sleep in the other world. And the clothes we wear in this world, those too we will wear there. Everything will be as it is now, just differently textured.
The familiarity of variable-poly surfaces, however, belies the novel forms of visuality that animate the metaverse. Beneath the illusory layer of avatars and environments, techniques such as subtle gaze direction exploit research on human visual cognition and attention, while computer vision algorithms are refined by data synthesized using computer graphics. For humans, these new images operate below the threshold of perception; for machines, they provide the foundation of perception. Machine learning models trained on synthetic data, which is diverse and perfectly labeled by design, can generalize as well as, or better than, models trained on real data, enabling avatars to represent users in granular detail. The development of immersive technologies constitutes its own form of vision.
SMPLverse NFTs use facial recognition to retrieve data synthesized by Microsoft to train face tracking algorithms for its mixed reality headsets. Like many PFP (profile picture) projects, the Microsoft dataset uses procedural generation to randomly create and render faces without any manual intervention. The images combine a generative 3D face model with a library of artist-created assets (including textures, hair, and clothing) and SMPL, a body mesh used to animate avatars in virtual environments. Though the synthetic images are indifferent to human vision, they tempt an overidentification: their verisimilitude is formally irrelevant to machine learning, yet it remains the only method of achieving functional equivalence with real data.
After minting, you receive a token with which you can submit an image through a webcam interface. Your submitted image is written to the token as a hash and matched, using a facial recognition model, to the closest SMPL that has not already been claimed. Whereas the matched image mimics the form of your identity, the hash stored in the token preserves its content, although indecipherably (the image is hashed using a one-way compression function). When the image is matched, it receives one attribute: confidence. Confidence assesses the likelihood that your image matches the SMPL you receive. Confidence is a declining measure: as more SMPLs are matched, your likelihood of receiving a high-confidence match decreases.
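The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: SHA-256 stands in for the unspecified one-way hash, cosine similarity over toy two-dimensional vectors stands in for the facial recognition model, and the names `image_hash`, `match`, `pool`, and `claimed` are hypothetical.

```python
import hashlib
import math


def image_hash(image_bytes: bytes) -> str:
    """One-way digest of the submitted image (SHA-256 is an assumption)."""
    return hashlib.sha256(image_bytes).hexdigest()


def cosine_similarity(a, b):
    """Similarity of two embedding vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def match(embedding, pool, claimed):
    """Assign the closest unclaimed SMPL and return (smpl_id, confidence).

    Confidence is taken to be the similarity score itself, which is an
    assumption made for this sketch.
    """
    best_id, best_score = None, None
    for smpl_id, candidate in pool.items():
        if smpl_id in claimed:
            continue  # already matched to an earlier submission
        score = cosine_similarity(embedding, candidate)
        if best_score is None or score > best_score:
            best_id, best_score = smpl_id, score
    if best_id is None:
        raise ValueError("no unclaimed SMPLs remain")
    claimed.add(best_id)
    return best_id, best_score


# Toy pool of three synthetic faces, each reduced to a 2-D embedding.
pool = {"smpl_0": [1.0, 0.0], "smpl_1": [0.8, 0.6], "smpl_2": [0.0, 1.0]}
claimed = set()
face = [1.0, 0.1]  # stand-in embedding of a webcam image

token_hash = image_hash(b"raw webcam bytes")  # stored, but not reversible
first = match(face, pool, claimed)   # best available match
second = match(face, pool, claimed)  # same face, smaller pool
print(first, second)
```

Submitting the same face twice makes the declining measure visible: the second submission can only be matched against what is left, so its confidence is lower than the first.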
Visually, the training images suggest identity play, emulating the avatars whose naturalism they are meant to enhance. Their effect, however, is to attune algorithms ever more precisely to users’ physical selves: the dataset expresses how users appear to virtual environments as much as how they appear in them. The minting process elaborates this tension between individual agency and computational constraint: even a pool of 100,000 images is too small to deliver high-confidence matches consistently.
Unlike the elective avatars that crowd the metaverse’s surface, SMPLverse NFTs are avatars for the metaverse’s infrastructure. Beyond digital culture’s native identity paradigm, whereby identity is constructed and maintained strategically, as if from a third-person perspective, SMPLverse maps the nonhuman identities conceived by machine learning. Though the secondary market may stage the traumatic return of the elective avatar repressed by the user’s biometric identity, it cannot mask the fact that the synthetic images are themselves proxies for seeing machines, which exceed and overcome the human binary of pseudonymity and identity.