Nvidia’s proposed method relies on a generative adversarial network, or GAN. A GAN consists of two neural networks built on algorithms used in unsupervised machine learning, an approach that pushes artificial intelligence to “learn” through trial and error without human intervention, for example by separating images of cats and dogs into two groups.
In this case, one neural network is called the “generator” while the second is the “discriminator.” The generator network tries to create an image that is indistinguishable from the training samples. The discriminator network then compares the render to the samples and provides feedback. Over time, the generator network gets better at rendering, and the discriminator network gets better at scrutinizing. The final goal is for the generator to keep refining its renders until they “fool” the discriminator network.
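The feedback loop between the two networks can be sketched with the classic GAN loss functions. This is a minimal illustration, not Nvidia's training code: the function names are hypothetical, the discriminator is assumed to output a probability between 0 and 1 that an image is real, and the binary cross-entropy objective shown here is the textbook formulation, which may differ from the loss the researchers actually used.

```python
import math

EPS = 1e-12  # guards against log(0); an illustrative choice


def discriminator_loss(real_scores, fake_scores):
    """Low when real images score near 1 and generated images score near 0."""
    real_term = -sum(math.log(s + EPS) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s + EPS) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Low when the discriminator is fooled, i.e. generated images score near 1."""
    return -sum(math.log(s + EPS) for s in fake_scores) / len(fake_scores)


# Hypothetical discriminator outputs early and late in training.
early_fakes = [0.1, 0.2]   # fakes are easy to spot
late_fakes = [0.8, 0.9]    # fakes mostly pass as real

print(generator_loss(early_fakes) > generator_loss(late_fakes))  # True
```

Each network is trained to lower its own loss, so any gain by the generator shows up as a harder problem for the discriminator, and vice versa.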
Nvidia wanted to improve on earlier image-generation attempts, including efforts by Google, by creating higher-quality images and generating a wider variety of computer-generated images in less time. To do that, the researchers created a progressive system. Because the A.I. learns more as more data is fed into the system, the group introduced more difficult renderings as the system improved.
The program started by generating low-resolution images of people who don’t actually exist, inspired by the photos in the database, all of which are images of celebrities. As the system improved, the researchers added more layers to the program, each adding finer detail, until the low-resolution images became 1,024 x 1,024-pixel photos. The result is high-resolution, detailed images of “celebrities” who don’t actually exist in real life.
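The progressive approach can be illustrated with a short sketch. It assumes each added layer doubles the image width and height, and that a new layer's output is blended in gradually rather than switched on at once. The function names, the 4 x 4 starting size, the default final size, and the nearest-neighbor upsampling are illustrative assumptions, not details given in the article.

```python
def resolution_schedule(start=4, final=1024):
    """List the image sizes visited when each new layer doubles the resolution."""
    sizes = []
    size = start
    while size <= final:
        sizes.append(size)
        size *= 2
    return sizes


def upsample_2x(image):
    """Nearest-neighbor upsampling of a 2-D list of pixel values."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled older output with the new layer's output.

    alpha ramps from 0.0 (older layers only) to 1.0 (new layer only).
    """
    upsampled = upsample_2x(old_low_res)
    return [
        [(1.0 - alpha) * old + alpha * new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(upsampled, new_high_res)
    ]


print(resolution_schedule())  # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

Blending in each new layer gradually means the networks are not suddenly handed a much harder task, which matches the article's description of difficulty increasing as the system improves.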
Along with creating computer-generated images with higher resolution and more impressive detail, the group worked to increase the variation of the generated graphics, setting new records compared with earlier projects using unsupervised algorithms. The research also included new ways of making sure the generator and discriminator networks don’t engage in any “unhealthy competition.” The group also improved the original dataset of celebrity images that it started out with.
Along with generating images of celebrities, the group also used the algorithms on datasets of images of objects, such as a couch, a horse, and a bus.
“While the quality of our results is generally high compared to earlier work on GANs, and the training is stable in large resolutions, there is a long way to true photorealism,” the paper concludes. “Semantic sensibility and understanding dataset-dependent constraints, such as certain objects being straight rather than curved, leaves a lot to be desired.”
While there are still some shortcomings, the group said that photorealism with computer-generated images “may be within reach,” particularly in generating images of fake celebrities.