Using DALLE-2 to illustrate stories

0x0f67
April 11th, 2022

Over the last few months, my partner, Gene Kogan, and I have been experimenting with illustrating a short sci-fi story I’ve written and recently published with Untitled Frontier. Last week, Gene got access to DALLE-2, so we decided to run more experiments with AI and story illustration.

We’re all still discovering what this new model is capable of, so one experiment won’t be enough. At first, the idea was to create prompts based on the story and see which artistic styles would suit it best. It’s a tale told from the point of view of a self-proclaimed Martian, who does their best to translate their experience into human terms. They want to help a girl, Diana, come to terms with her new reality after an adventure through humanity’s memorial (we call it the Human Memorial Monument, HMM). In this faraway future, humans live in the Logged Universe, an extremely complex simulation that runs inside the memorial.

DALLE-2 gives great insight into character development. Below I share some of our trials.

For concept art, this is clearly a fascinating power. Each of these sets of images was generated in about 20 seconds. The knowledge that has been widely shared by the Twitter AI art community during the last few months has proven key to guiding the model. Remedios Varo was an astonishing surrealist painter, whose work has recently gained new attention in the art world. Her style became a popular reference for AI art, and I was very pleased to see how well it merged with other attributes, like ‘made with Unreal Engine’. When I'd describe the work as an oil painting but also mention Unreal, the images would come out with more shadows and depth than when Unreal was not included. It's also possible to describe hues, textures and much more.
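To make the idea of stacking references concrete, here is a minimal Python sketch of how a prompt can be assembled from a scene description plus a list of style attributes. It only builds text; it does not call DALLE-2. The function name `build_prompt` and the scene description are made up for illustration and are not the prompts we actually used; only the style references (oil painting, Remedios Varo, Unreal Engine) come from the experiments described above.

```python
def build_prompt(subject, modifiers):
    """Join a scene description and style attributes into one text prompt."""
    # Drop empty entries and stray whitespace so the prompt stays clean.
    parts = [subject.strip()] + [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts)


# Hypothetical scene description, for illustration only.
subject = "a Martian guiding a girl through a vast memorial"

# Same scene, with and without the Unreal Engine attribute,
# so the two results can be compared side by side.
base = build_prompt(subject, ["oil painting", "in the style of Remedios Varo"])
with_unreal = build_prompt(
    subject,
    ["oil painting", "in the style of Remedios Varo", "made with Unreal Engine"],
)

print(base)
print(with_unreal)
```

Keeping the scene fixed and changing one attribute at a time makes it easier to see what each reference contributes to the image.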

The editing tool of DALLE-2 is still quite limited compared to what Adobe has been adding to Photoshop's Neural Filters, but it is already very useful for composing scenes.

James Turrell installations merged really well with Remedios Varo's style
Adding a character to the scene, especially if described in a similar style, gave us interesting results
There were two more phases here that I forgot to screenshot, in which we added a robot and then a plant between the characters. Then we decided to change the surroundings.
In the current format, it's hard to change only small details in the scene. Either the edit changes nothing, or one has to select a large area and much is transformed.
These are scene variations auto-generated by the model, with no extra prompt. With all the mixed references, this is an illustration style native to DALLE-2.

As one more example of the model being used for concept art, here are several versions of the prompt “The Nyan cat is a tool for tricking minds into compression”.

By combining a lot of different references it's possible to create a new style

What I'd love to do is train the model to recognize my own painting style, so I could do quick composition sketches and have the AI fill in the textures and shadows. We tried to edit a frame from an animation I created using Blender and 3D scans of my ceramics (characters and scenery). It was interesting but not very helpful; I needed better lighting in my own scene.

Then we tried the same on an illustration I made of a tropical forest, with watercolor and Photoshop combined. The additions can merge really well with the scene.

As an artist who has worked intensively as a professional illustrator, I do have some mixed feelings about these new models. I find them fascinating, mind-blowing in how they expand creative possibilities. Especially when combined with language models such as GPT-3, I see a lot of potential for a new world of independent artists, who at some point will be able to create interactive animated stories that guide the audience through a fictional universe. The AI models could also create custom content based on each user's taste, without a specific artist to guide them.

The work of professional illustrators and photographers has already been going through accelerated changes. I can't see a bright future for selling stock photos, and I believe very soon we'll need a clearer protocol to prove a photo was really taken at a certain time with a specific camera, to prevent fakes. Blockchains might become key as proof of provenance. To make a living as an illustrator, it's common to specialize in an easy-to-recognize style, which also helps the artist work faster - with AI making styles too easy to reproduce, variety or new levels of complexity will become key. It's increasingly common now for illustrators to also be graphic designers; each person needs to be capable of more tasks. AI makes it easier for any illustrator to also be an animator, world builder and who knows what else. Working with AI is like cooperating with human collective intelligence.

Yet, for many of us, the adaptations won't be so simple. Early adopters are often rewarded, while many professionals are too busy struggling to pay rent to have the time to catch up. Also, not only DALLE-2 but many AI models can create styles based on professional artists still working today. One of the most interesting cases in my opinion is James Gurney, who became a trend among the AI art Twitter scene in the last two years. I was very happy to see that people like Katherine Crowson (@RiversHaveWings) started tagging Gurney on Twitter and he learned about the AI models' powers. He openly enjoyed participating in the creation of AI art guided by his work. This exchange of ideas is one of the most beautiful examples of technology fulfilling the potential of human collective creation. Yet, I'm not so sure that many other artists would be thrilled with a sudden spike of images very similar to their work, with no compensation and no help to sustain their livelihoods. Copyright laws need serious redesigns, or they might not mean anything at all. Works by an artist like M.C. Escher are still protected under copyright law, but what does that mean when their legacy is part of a guiding style for DALLE-2?

I do think professional artists will have their own advantages when collaborating with AI models. Understanding composition, color, storytelling and art historical references all helps a work reach a higher standard. Right now there's a boom in 3D software - not only have the tools become easier to use, but open source alternatives like Blender and all the amazing free YouTube tutorials make them so much easier to learn and play with. There is a whole economy of paid plugins and Blender-related services, yet the core is free. So my hope lies in the open source movement, with all the free tools, great learning materials and a lively community. At this moment DALLE-2 is only available to a small group of people, and it's owned by a company, OpenAI. It's an enormous responsibility for this company to release the model and the research around it in a way that helps society flourish instead of proving all the fears around AI to be right.
