alignDRAW - The CryptoPunks of AI Art

It’s not the perfect analogy, but now that I have your attention, read on to understand why I think this is one of the most significant pieces of AI art. Disclaimer: I own pieces from the collection.

The short version

alignDRAW, a 2015 paper by Elman Mansimov, is regarded as the first text-to-image model. The author, in collaboration with Fellowship, created an NFT collection of the outputs from the model. These are the outputs from the first time we taught machines to conjure images just by describing them with words.

a toilet seat sits open in the grass field

The images are pixelated and blurry, but read the prompt and examine the outputs. It’s clear the model “understands” the prompt. To me, the collection represents a significant milestone in human progress. It is equal parts science and art. Text-based prompting is the main modality of AI image generation today. As the technology gets better and proliferates through more and more parts of our lives, we can point back to this as where it all began.

Credit: https://x.com/lifeofc/status/1735048737452646647?s=20 (must read)

While this post was being drafted, Sora was announced. The pace of innovation is relentless and will continue picking up speed. Take a look at this video and see how far we’ve come in less than 10 years. What will the next 10 years look like? I have no idea, but I have extreme conviction that this collection will come to represent the entire cultural impact of text-to-image (and now video) AI generation. It’s the closest we can get to owning a piece of this civilizational milestone.

The long version

If you prefer video, this is a great introduction to the collection

Unlike the creators of most other NFT collections, Elman did not set out to create art. As a 19-year-old University of Toronto student, he wanted to see if the models that were used to automatically caption images could instead be made to do the reverse: given a caption, create the image. He and his colleagues were successful, and in November 2015 they submitted the alignDRAW paper.

The published paper has 21 unique prompts with 8 images each to showcase the model behavior. These are called the “paper” prompts. There are a further 21 prompts with 121 images each that were used to test the model. These “process” prompts are not mentioned in the paper directly (but are included in the source file attached to the paper).
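As a quick sanity check on those counts, the arithmetic can be sketched in a few lines of Python. The variable names are purely illustrative and do not come from the paper; the figures are the ones quoted above.

```python
# Figures quoted in the paragraph above; the names are illustrative only.
PAPER_PROMPTS = 21
IMAGES_PER_PAPER_PROMPT = 8
PROCESS_PROMPTS = 21
IMAGES_PER_PROCESS_PROMPT = 121

paper_outputs = PAPER_PROMPTS * IMAGES_PER_PAPER_PROMPT        # 168
process_outputs = PROCESS_PROMPTS * IMAGES_PER_PROCESS_PROMPT  # 2541
total_outputs = paper_outputs + process_outputs                # 2709

print(paper_outputs, process_outputs, total_outputs)
```

The total of 2709 agrees with the collection size given in the drop mechanics section.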

a very large commercial plane flying in rainy skies (paper prompt)
an airplane flying off into the distance on a clear day (process prompt)

You can view all of the outputs grouped by prompt here.

Why should you care?

Context

Watch this 2022 Vox video for context from when Midjourney and DALL-E started taking off

In the year and a half since the video, the outputs shown already seem dated, and AI-generated art has exploded into the public consciousness. It all began with Google’s DeepDream (July 2015) and the neural style transfer paper (Aug 2015). But these still only “converted” one image to another.

In terms of influence, the 2014 GAN paper is unquestionably more important to AI in general. A close contender would also be DCGAN, which is what Robbie’s work and that of many other early AI artists is based on. Incidentally, DCGAN was published 10 days after the alignDRAW paper. 2015 was the year when AI-based art truly began.

alignDRAW is the first time computers could generate meaningful images from scratch based on the user’s intent. The significance of this collection is in its breathtaking technical achievement and the promise of what is to come.

Provenance

Alright, so this was a 2015 paper, but the collection was actually minted in 2023. So, why is it worth anything? Is this “legitimate”? Can anyone just come out and mint anything whenever they want and claim it is historical?

There is only one answer to all of these questions, and it is: it depends on social consensus.

No, this did not start out on the blockchain, but it started out with a means of establishing provenance that has worked for hundreds of years: a published scientific paper. Elman was the author of the paper. The paper was published in 2015. The paper contains the paper prompts and outputs. These are undisputed facts. Elman, as the author of the paper and the person minting the collection, is conferring his credibility on it. Elman is the one who makes the collection “legitimate” and creates social consensus around it.

It is possible to transfer provenance and credibility onto the blockchain. That’s what photographers and digital artists do when they mint a work. The work existed outside of blockchains; when they mint it, they confer their credibility on the collection/token and help bring consensus into existence.

The blockchain only attests that a token id belongs to a particular address at a point in time. Except for a very limited set of “in-chain” collections, there are no other guarantees as to what the id represents. Everything else on top of the token id is layers of social consensus. CryptoPunks themselves are the canonical example of this. When they were released, the images only existed off chain. All the smart contract does is map a number to an address. In fact, there is a v1 version of the CryptoPunks contract which is older than the v2 version. And yet, v2 is what’s more valuable, and v2 is what everyone refers to when talking about CryptoPunks. Why? The artists disavowed v1, and the consensus is that v2 is the “legitimate” contract.
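To make the “number to an address” point concrete, here is a minimal Python sketch of an ownership ledger. It is not the CryptoPunks contract or the alignDRAW contract, and it is not how any real contract is written; the class `TokenRegistry`, its methods and the addresses are all hypothetical and exist only for illustration.

```python
class TokenRegistry:
    """Toy ownership ledger (hypothetical): token id -> owner address."""

    def __init__(self):
        # The only state kept: which address owns which token id.
        self._owners = {}

    def mint(self, token_id, owner):
        # A token id can be created only once.
        if token_id in self._owners:
            raise ValueError(f"token {token_id} already minted")
        self._owners[token_id] = owner

    def owner_of(self, token_id):
        return self._owners[token_id]

    def transfer(self, token_id, sender, recipient):
        # Only the current owner may hand the token to someone else.
        if self._owners.get(token_id) != sender:
            raise PermissionError("only the current owner can transfer")
        self._owners[token_id] = recipient


registry = TokenRegistry()
registry.mint(1, "0xAlice")               # made-up addresses
registry.transfer(1, "0xAlice", "0xBob")
print(registry.owner_of(1))
```

Nothing in this structure records which image token 1 stands for. That link is made outside the ledger, which is where the social consensus comes in.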

Blockchain timestamp maximalism, similar to bitcoin maximalism, is not how the world works in reality.

Aesthetics

Let’s face it, this is not going to win any competition for aesthetics. The images are pixelated and blurry and sometimes just blobs of color. But remember, this is not an art project. It’s the result of a scientific endeavor. It is the very first time something like this has been attempted. Midjourney v6 did not come out of nowhere. There is no scenario where the outputs of the very first AI image model are beautiful, 4K, photorealistic images. The Wright flyer does not look like a modern jet, and the first photograph looked like this:

Nicéphore Niépce “Heliography” - The world’s first photographic process (1826) (credit: aligndraw.fellowship.xyz/historical-context)

That said, some of the prompts are aesthetic once you know what to look for. You need to have spent some time looking at the whole collection and the prompts, and you begin to see which pieces show the model succeeding in representing the prompts.

A picture of a morning sky #113 (process prompt)
A toilet seat sits open in the grass field #2 (paper prompt)

Investability

When collecting NFTs, it’s always best to follow the golden rule from Artnome

However, I’m also a strong believer that certain sovereign, networked, digital objects will become strong stores of value (Derek’s article is a must read if you have not come across it before).

Fellowship have taken pains to build the contract in such a way that it has minimal external dependencies. Currently, only the stop sign is on chain, and the other images are on IPFS. But the collection is designed to be able to support bringing all the images on-chain at a later date.

I will not go too deep into the investment case for NFTs or AI art here. That’s a longer post for another time. However, the one thing worth mentioning is that AI-based art is driving the cost of creation to zero. There is going to be an explosion of art. The pieces and artists that will have enduring value are a) the historic “firsts” that are immutable truths or b) the ones that have a lasting influence on the movement. It is too early to talk about the latter. Therefore, I believe the former is the more obvious Schelling point for investors.

Collectibility

For something that did not start out as an art collection, Fellowship and Elman have structured the collection in an incredibly thoughtful way for collectibility.

There are clear “tiers” within the prompts. The stop sign is the “first” prompt and output. The paper prompts are directly presented in the paper and are very strong outputs, especially viewed as a set. But even within the paper prompts, I view the more outlandish prompts as a better demonstration of the model’s capabilities. This also applies to the process prompts (which are not presented in the paper, but are included in the source file attached to it), but they have a lot more variability in quality.

As a collector, this is how I would rank the outputs (very subjective)

  • A full set of one of the more outlandish “paper” prompts (elephants in the sky, skiing in the desert etc)

  • Two or more outputs of the same paper prompt showing the range of the model

  • A full set of the 21 process prompts

  • Any one output from the paper prompts

  • 2-3 outputs from the same process prompt (bonus points if they are aesthetic or more legible than average)

  • Lastly, any random piece from the collection

This is a collection where a single floor piece is not very representative. It is best viewed as a set of at least 2-3 images for the same prompt. There are also varying levels of quality in the outputs of each prompt. Some prompts and outputs are much better than others. Almost all of the paper prompts are a step above the process prompts. Looking at the random assortment of thumbnails on OpenSea does not capture the magnitude of this collection. This is part of the reason why I sat down to write this post.

In summary, for an investor or collector looking for exposure to AI art, no other collection matches the provenance, collectibility or enduring significance of alignDRAW.

Drop mechanics and distribution

The collection itself consists of each individual output (2709 in total) as a separate NFT. The “stop sign” prompt and its set of 8 NFTs was auctioned via Christie’s and sold for 15 ETH (to me). 18 of the remaining 20 paper prompts were auctioned as sets via daily.xyz and averaged slightly above 3 ETH. 1 was reserved for Fellowship’s collection and 1 was reserved for placement with an institution. 2015 of the remaining process prompts were sold via public Dutch auction via fellowship.xyz, settling at the resting price of 0.1 ETH. The remaining 541 were reserved (some given away to other AI artists, some raffled among prominent NFT communities etc). Finally, 3 print editions were acquired by the Worcester Art Museum.

Risks

The collection is personally meaningful to me, and will remain so even if the price went to 0. That may not be the case for everyone else. There are risks to buying the collection, in order of priority:

  • The collection is just forgotten. There is not enough volume or interest to maintain the attention required to keep it relevant. This is the biggest risk. It is somewhat unlikely given the prominence of text-to-image everywhere else, but it does exist.

  • Understanding and appreciating this collection takes too much effort. This has also led to a skewed owner distribution (19% unique owners, which is pretty low compared to other collections).

  • There's an even earlier or more influential paper, and a collection around it surpasses alignDRAW as the natural Schelling point. There are a few possible candidates: the original GAN paper, Google's DeepDream, the neural style transfer paper or DCGAN. Of these, only DeepDream and DCGAN are conducive to an NFT collection. A collection of DeepDream’s or DCGAN’s initial outputs would do well, but they are not really text-to-image, and they would also bring renewed attention to alignDRAW.

  • Modern text-to-image models are not direct descendants of the alignDRAW architecture, so the impact of the paper on the models we use today is probably minimal. Best case, it was a proof of concept, but a dead end otherwise.

  • There are problems with the original paper. e.g. plagiarism allegations, non-reproducibility etc

  • Text based prompting is just a fad, we quickly move on to other, better modalities

  • Elman/Fellowship do something that irreparably damages credibility. Just as Elman confers his legitimacy on the collection to form social consensus, he can take it away with damaging actions, e.g. Elman mints a copy of the collection on Solana. I believe Elman and Fellowship are good-faith, long-term actors. However, this is a possibility.

  • Catastrophic smart contract bug/hack

  • Finally, people just don't accept that something that existed before blockchains can be brought on chain (essentially everyone becomes an on-chain maximalist, photography NFTs are worthless etc).

Disclaimer

I own pieces from the collection. You can view all my pieces here. I have no intention of selling them anytime soon. I’m writing this post because I’m passionate about the collection and because I believe it is not well understood.

Thanks

Thank you to Elman, Alejandro, Studio137 and Astam for reviewing the draft of this post and providing valuable feedback.
