The 2021 Lester Prize, one of Australia’s premier art prizes for portraiture, was announced recently.
And that got us thinking: as we embrace artificial intelligence (AI), how long will it be before AI-generated portraits replace hand-drawn ones?
AI-generated portraits are the products of generative adversarial networks (GANs). A GAN is an approach to generative modelling that uses deep learning methods, such as convolutional neural networks. This is a prime example of the technology in action. Kinda creepy, eh?
Breaking down the GAN
In machine learning, generative modelling is an unsupervised learning task: the model automatically discovers and learns the regularities or patterns in its input data. Once trained, it can generate completely new examples that plausibly could have been drawn from the original dataset.
A GAN trains the generative model by framing the problem as a supervised learning task with two sub-models. The first is a generator model, trained to create new examples. The second is a discriminator model, which attempts to classify examples as either real (drawn from the actual dataset) or fake (produced by the generator). The two models are trained together in an adversarial, zero-sum game until the discriminator can no longer tell real from fake – signifying that the generator is producing plausible examples.
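To make that adversarial loop concrete, here is a minimal sketch in plain NumPy – our own toy illustration, not code from any of the systems mentioned here. The generator and discriminator are each reduced to a single linear layer, and the "data" is just a 1-D Gaussian, but the alternating update scheme is the same one real GANs use:

```python
import numpy as np

# Toy GAN: the generator learns to mimic samples from a 1-D Gaussian
# (mean 4, std 0.5). Both networks are single linear layers for
# brevity -- a real GAN would use deep (e.g. convolutional) networks,
# but the adversarial training loop is identical in shape.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator: fake = w_g * z + b_g  (maps random noise z to a sample)
w_g, b_g = rng.normal(size=2)
# Discriminator: p(real) = sigmoid(w_d * x + b_d)
w_d, b_d = rng.normal(size=2)

lr = 0.05
for step in range(2000):
    z = rng.normal(size=64)                 # noise input
    fake = w_g * z + b_g                    # generated samples
    real = rng.normal(4.0, 0.5, size=64)    # samples from the true data

    # --- Discriminator update: push p(real) -> 1 and p(fake) -> 0 ---
    p_real = sigmoid(w_d * real + b_d)
    p_fake = sigmoid(w_d * fake + b_d)
    # gradients of the binary cross-entropy loss w.r.t. w_d and b_d
    g_w = np.mean((p_real - 1.0) * real) + np.mean(p_fake * fake)
    g_b = np.mean(p_real - 1.0) + np.mean(p_fake)
    w_d -= lr * g_w
    b_d -= lr * g_b

    # --- Generator update: try to fool the discriminator ---
    p_fake = sigmoid(w_d * fake + b_d)
    # gradient of -log(p_fake) flows through the discriminator's
    # weight back into the generator's parameters
    dg = (p_fake - 1.0) * w_d
    w_g -= lr * np.mean(dg * z)
    b_g -= lr * np.mean(dg)

# After training, the generator's output distribution should have
# drifted toward the target Gaussian.
samples = w_g * rng.normal(size=1000) + b_g
```

The zero-sum structure shows up in the two updates: the discriminator descends on the classification loss while the generator descends on the opposite objective (making its fakes score as real), and training settles when neither can improve against the other.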
AI-generated portraits are strikingly realistic.
In one of our HPC Hour talks, Dr Vladimir Puzyrev, Senior Research Fellow at Curtin University, took a deep dive into deep learning and showed us how neural networks can generate realistic, high-definition artificial portraits, paintings, animals and text.
Some of the tools used to generate artificial portraits include NVIDIA’s StyleGAN1, StyleGAN2 and the just-released StyleGAN3.
The rapid progress in AI-enabled portrait generation over just five years, culminating in NVIDIA’s StyleGAN1. Photo credits: https://arxiv.org/pdf/1406.2661.pdf https://arxiv.org/pdf/1511.06434.pdf https://arxiv.org/pdf/1606.07536.pdf https://arxiv.org/pdf/1710.10196.pdf https://arxiv.org/pdf/1812.04948.pdf
You can see in the image above how quickly the technology has progressed in just five years: from low-resolution, grainy black-and-white portraits in 2014 to the life-like, highly convincing portraits produced by NVIDIA’s StyleGAN1 in 2018.
The image below contains portraits generated with the newer NVIDIA StyleGAN2, which yields impressive, yet scarily genuine results. Check out this article for more insight into how StyleGAN2 works.
We don’t think The Brady Bunch was computer-generated (after all, the sitcom premiered back in 1969), but look at the resemblance!
Portraits generated by StyleGAN2. Photo credits: Connor Shorten/Towards Data Science
As these super-realistic portraits show, machine learning has come a long way. Combining generative adversarial methods with other machine learning techniques can take AI-generated images even further by adding motion. This paper shows how modern AI technologies can generate a fairly realistic video from a single image. Wanna see a talking Mona Lisa? Check out this video!
These techniques are already being used to create mischief. From impersonating journalists on Twitter with AI-generated profile pictures to tools for creating entire fake profiles, it’s fascinating, yet worrisome, how far this technology has come in just a few years.