As mentioned in my last post, my partner Gene Kogan and I have been experimenting with illustrating a short sci-fi story I’ve written and recently published with Untitled Frontier. The story evolved from a project I started in 2020, the first time I went to Mars. Not as far as the planet Mars, but a little part of the Sonoran desert where a bunch of friends created an art community/R&D lab we call Mars. After all, if our technology could one day take some humans to the red planet, we could just as well use it to try to inhabit the harsh parts of present-day Earth. A high-tech, low-cost approach. It’s an exercise in resilience and sustainability, with a lot of space for artistic madness and fun.
I collected local clay and created ceramics with primitive kilns I dug into the ground (in this post one can read more about the process!). Then I got fancy, gained access to professional ceramic studios and experimented with a variety of high-end clays and glazes. I learned to transform the sculptures into 3D digital objects using a technique known as photogrammetry, which started me on the long path of learning 3D software and animation, but I'll talk about this some other time. For this post, I'd like to discuss the process of training a type of AI model called a GAN on photos of my own sculptures. By the way, I call them Little Martians, and now I'm creating a whole sci-fi universe around them.
For those of you who do not know, GAN stands for Generative Adversarial Network. A very short, simplified explanation of what GANs are: models from statistics and computer vision capable of creating images similar to those in a dataset - in this case, the dataset is a collection of JPEGs. The more similar these JPEGs look to each other, the more likely a GAN will be able to create output that looks like them. The best results usually need several thousand images, but as the technology improves, fewer and fewer are needed for GAN outputs to look just like the originals. If the dataset has a lot of variety, the output is going to be more abstract, and that can be beautiful in its own right. There are many different GAN models, each with many different properties. They became popular among creative technologists and eventually among people who sell art. Nowadays there are other models just as impressive as GANs, such as the diffusion family, but GANs are the best known outside AI research circles.
If someone wants to get a quick feel for what kind of images, abstract or realistic, can be made with this technology, I recommend using Artbreeder. Otherwise, there are many ways one can train a GAN on a custom dataset, such as photos of your own artwork. One of the easiest, though somewhat expensive, is Runway's image training tool, which, last time I checked, was using the StyleGAN2-ADA model. There are many useful tutorials on how to use Runway, or you can try the open source alternative: notebooks on Google Colab that can generate images for free. I highly recommend the Artificial Images YouTube channel. Below is a photo of one of my ceramic sculptures on the left and a Runway-generated AI version of a Little Martian on the right. Here you can see the video of the first Little Martian GAN.
This year, I decided to take it seriously and do a real art project using GANs. It's extremely interesting as a photography exercise: I had to take as many pictures as I could of each Little Martian sculpture - there are more of them, but I used around 100. I could play with lighting and angles and take very different images of the same sculpture. I used a tiny $10 photo booth I bought from Alibaba, candles, a color-changing LED lamp and my several-years-old Sony camera. To be honest, I don't think my professional photographer friends would be very impressed with my setup; one could do far better.
Because my photography setup was far from perfect, I needed to edit the photos to apply a dark background. Usually I do it all in Lightroom, as it's now very easy to select a subject in a photo and apply a mask around it. Unfortunately, it's not as much fun to do it with thousands of photos. I'm sure there must be some shortcuts for that in Lightroom, but my partner Gene happens to be extremely good at writing code to automate tasks, so he offered to write custom software that would run the BASNet AI model on all my photos, apply a black background, and center and crop the images with a calculated variation. As far as I understand, the BASNet model does pretty much the same thing Adobe does when it automatically identifies a subject in a picture, though Adobe seems to have made its own improvements - the version we used gave us fuzzy, ugly edges, while Lightroom tended to be more accurate!
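For readers curious about what such a batch step involves, here is a minimal sketch of the three operations described above: blackening the background with a mask, centering on the subject, and cropping with a little random variation. It is not Gene's actual software. It assumes the mask has already been produced by a segmentation model such as BASNet, with values between 0 (background) and 1 (subject), and it uses plain nested lists instead of an imaging library so it stays dependency-free. All function names, the margin and the jitter values are hypothetical.

```python
import random

BLACK = (0, 0, 0)

def blacken_background(image, mask):
    """Multiply every RGB pixel by its mask value (0 = background, 1 = subject)."""
    return [
        [tuple(round(channel * m) for channel in pixel)
         for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]

def mask_bbox(mask, threshold=0.5):
    """Bounding box (left, top, right, bottom) of the mask area above threshold."""
    rows = [y for y, row in enumerate(mask) if any(v > threshold for v in row)]
    cols = [x for x in range(len(mask[0]))
            if any(row[x] > threshold for row in mask)]
    if not rows:
        raise ValueError("mask contains no subject")
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1

def subject_crop_box(bbox, margin=0.2, jitter=0.05, rng=random):
    """Square crop centered on the subject, with random scale and offset.

    margin and jitter are illustrative values, not the ones used in the project.
    """
    left, top, right, bottom = bbox
    side = max(right - left, bottom - top) * (1 + margin)
    side *= 1 + rng.uniform(-jitter, jitter)
    side_px = max(1, round(side))
    center_x = (left + right) / 2 + rng.uniform(-jitter, jitter) * side_px
    center_y = (top + bottom) / 2 + rng.uniform(-jitter, jitter) * side_px
    left_px = round(center_x - side_px / 2)
    top_px = round(center_y - side_px / 2)
    return left_px, top_px, left_px + side_px, top_px + side_px

def crop_with_padding(image, box):
    """Crop to box, filling anything outside the image with black."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    return [
        [image[y][x] if 0 <= y < height and 0 <= x < width else BLACK
         for x in range(left, right)]
        for y in range(top, bottom)
    ]
```

In a real pipeline the same steps would run on image files through an imaging library, and the square crop would then be resized to the resolution the GAN expects. The random variation means the same sculpture never lands in exactly the same place twice, which adds a bit of diversity to the dataset.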
For posting, I would retouch these terribly ugly edges, but for training a StyleGAN, we found they didn't matter so much. Gene and I used a StyleGAN3 model, one of the most recent ones, and trained it on Gene's local computer, which is pretty powerful. The model only takes images up to 1024×1024 pixels, which is very limited by photography standards. The output edges came out a bit fuzzier than they would normally be, but not by much. So we trained the first model on over 5k of my images. Here are the results.
I think they are beautiful, but they were very far from the Little Martians' faces. Gene then asked me to select only frontal images and align them by the eyes. That proved to be quite a laborious task, because my sculptures are not human enough to have their eyes automatically recognized by AI models. So I decided to do it by hand, using guides in Photoshop. It took me quite a few hours to assemble over 2000 images in the right format. Below one can see the training process: starting from a model previously trained on human faces, we used transfer learning to adapt it to our own dataset.
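On the eye alignment step: the alignment in this project was done by hand with Photoshop guides, but the geometry behind it can be sketched in a few lines. Given two eye positions marked on a photo, the sketch below computes the scale, rotation and shift that move them onto fixed target positions, so every face ends up framed the same way. It is only an illustration under assumptions: the function name, the target eye positions and the output size are hypothetical, not values used for the Little Martians.

```python
import cmath
import math

def eye_alignment(left_eye, right_eye,
                  target_left=(0.35, 0.40), target_right=(0.65, 0.40),
                  size=1024):
    """Similarity transform that moves two marked eyes onto target positions.

    Points are treated as complex numbers, so the transform is z -> a * z + b,
    where a encodes scale and rotation and b encodes the shift.
    The target positions (fractions of the output size) are assumptions.
    """
    s1, s2 = complex(*left_eye), complex(*right_eye)
    if s1 == s2:
        raise ValueError("the two eye positions must differ")
    t1 = complex(target_left[0] * size, target_left[1] * size)
    t2 = complex(target_right[0] * size, target_right[1] * size)
    a = (t2 - t1) / (s2 - s1)
    b = t1 - a * s1

    def transform(point):
        z = a * complex(*point) + b
        return z.real, z.imag

    scale = abs(a)
    angle_degrees = math.degrees(cmath.phase(a))
    return scale, angle_degrees, transform
```

For example, `eye_alignment((100, 200), (200, 200), size=1000)` gives a scale of about 3 and no rotation, because the eyes are already level and only need to be enlarged and shifted. The scale and angle could then be applied to the whole photo with any image editor or imaging library.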
A very important part of playing with any AI model is working with someone who understands the code. When Gene does his training, it's an art form. He can guide what he wants from the results, human similarity or not, interpolate from any point in the latent space (which is what we call the space of possible images that can be generated with a trained AI), train faster or for longer, change several parameters, etc. The first time, Gene set a learning rate that was way too high - the model collapsed and had to relearn the shapes; one can watch the process in this video. There are many intricacies to playing with the code. Even as user interfaces improve and more people become used to collaborating with AI, understanding machine learning will most likely still be a great advantage.
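To make the idea of interpolating in the latent space a little more concrete, here is a minimal sketch, not Gene's code. Each image a GAN can generate corresponds to a list of numbers (a latent vector; StyleGAN models typically use 512 of them). Sliding smoothly from one vector to another and generating an image at each step produces a morphing animation. The sketch uses spherical interpolation, a common choice for this, on plain Python lists; the function names are hypothetical, and the step of feeding each vector to a trained generator is left out.

```python
import math

def lerp(a, b, t):
    """Straight-line interpolation between two latent vectors."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def slerp(a, b, t):
    """Spherical interpolation: moves along an arc between a and b."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding errors
    omega = math.acos(cos_omega)
    if omega < 1e-8:
        # The vectors point the same way, so a straight line is fine.
        return lerp(a, b, t)
    sin_omega = math.sin(omega)
    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

def interpolation_path(a, b, steps):
    """Latent vectors for an animation that starts at a and ends at b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [slerp(a, b, i / (steps - 1)) for i in range(steps)]
```

Each vector returned by `interpolation_path` would be passed to the trained generator to render one frame. In practice this is done with array libraries rather than Python lists, but the arithmetic is the same.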
Watch a video animating the latent space of this StyleGAN3!
Many of the outputs present these interesting geometric patterns. They are artefacts of StyleGAN3, and if one checks the video from my StyleGAN2 model, one can see it had a particular texture as well. They remind me a little bit of the textures of DeepDream, a technique created by Alexander Mordvintsev that gives a very particular aesthetic to images and 3D objects. I happen to be fascinated by geometric patterns, so I thought they were a great property of my AI Little Martians. And I can generate as many as I like! My plan now is to animate some of these AI creatures; the ones that have a recognizable mouth should work with an automated lip-sync animation AI model. Then I can make them tell us stories generated with an AI text generation model called GPT-3! It's a language model that can be guided into a style, as if I could give it a writer's personality.
Of course I'll make it a collaborator in the art of telling Little Martians stories.
As AI tools for image generation become easily available, it's possible to choose any picture and create variations of it. One can also recreate famous artists' styles, as I commented in my last post. If one downloads an artist's body of work (one can just scrape their Instagram account or website) and trains a GAN on it, the outputs can be quite similar to the original artist's work. Yet training your own AI model on carefully curated data gives very particular results that would not be part of a wider text2image model or a GAN made out of a few images. AI art collaboration is still in its infancy, and I do think that artists will be able to create many new unique styles by merging their own skills with AI powers. Now is a good time to imagine and experiment with what this might look like.