MIT and IBM’s new A.I. image-editing tool lets you paint with neurons

Whether it’s automatically tagging objects in pictures or tweaking lighting and separating subjects from their background with the iPhone’s “portrait mode,” there’s no doubt that artificial intelligence is a powerful force in modern photo-editing tools.

But what if it were possible to go one step further, and use the latest cutting-edge technologies to develop what may just be the world’s most ambitious (and, in its own way, imaginative) paint program — one that goes far beyond simply touching up or coldly analyzing your existing pictures?

With such a program, all a person would need to do to remove an unsightly line of cars sullying a picture of their family home would be to pass over it with a brush. As if by magic, the vehicles would be replaced by a photorealistic grassy bank. Want to eliminate that photobomber from one of your vacation snaps? No problem: Just click to select them and they’ll vanish, replaced by a utility pole that looks like it’s always been there. How about adding an authentically ancient door into a photo of an old church? Click and it’s done. You get the idea.

Editing images with neural networks

This is what researchers at Massachusetts Institute of Technology and IBM are working toward with an amazing new tech demonstration they call the “GAN Paint Studio.” Described by its creators as providing the ability to “paint with neurons” — referring to the artificial neurons of a machine learning neural network — it’s one of the most potentially transformative photo-editing tools yet created.

It allows users to upload an image of their choosing and then modify any aspect of it they want, whether that’s changing the size of objects or adding in completely new items and objects. Think of it as Photoshop for the “deepfake” generation, albeit one that’s currently more of a proof-of-concept than a finished product.

The future of creative tools

“What we created with this work is a starting point to show how creative tools in the future could work,” Hendrik Strobelt, a research scientist at the MIT-IBM Watson A.I. Lab, told Digital Trends. “We started from a neural network [called a] GAN that can produce its own images of a certain category — for example, kitchen images — and analyzed which internal parts of the network are responsible for producing which feature. This allowed us to modify the images that the network produced. We ‘drew’ on them. The novelty we added is that you can upload your own image of this category and modify it with brushes that do not just draw strokes, but actually draw semantically meaningful units — such as trees, brick-texture, or domes.”

A GAN, or generative adversarial network, is one of the most powerful tools used in generative artificial intelligence. A GAN pits two artificial neural networks against one another: One network generates new images, while the other attempts to work out which images are computer-generated and which are real. Over time, this adversarial process pushes the “generator” network to create images convincing enough to reliably fool the “discriminator.” A GAN was the technology behind Portrait of Edmond de Belamy, the A.I. artwork that famously sold for $432,500 at a Christie’s auction in 2018.
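The adversarial loop described above can be sketched in a few lines. The example below is a deliberately minimal toy, not the researchers’ actual model: a two-parameter “generator” learns to match the mean of a one-dimensional Gaussian against a logistic “discriminator.” The data distribution, learning rate, and weight-decay term are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data: samples from a 1-D Gaussian with mean 3.
# Generator G(z) = a*z + b; discriminator D(x) = sigmoid(w*x + c).
real_mean = 3.0
a, b = 1.0, 0.0        # generator parameters
w, c = 0.0, 0.0        # discriminator parameters
lr, decay = 0.02, 0.1  # small weight decay damps the two-player oscillation
history = []

for step in range(4000):
    real = rng.normal(real_mean, 1.0, size=64)
    z = rng.normal(size=64)
    fake = a * z + b

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)),
    # i.e. get better at telling real samples from generated ones.
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * real - d_fake * fake) - decay * w)
    c += lr * (np.mean((1 - d_real) - d_fake) - decay * c)

    # Generator: gradient ascent on log D(fake) -- push the fakes toward
    # wherever the discriminator currently believes real data lives.
    d_fake = sigmoid(w * fake + c)
    grad = (1 - d_fake) * w
    a += lr * np.mean(grad * z)
    b += lr * np.mean(grad)
    history.append(b)

# Over training, the generated samples' mean (b) drifts toward the real mean.
```

Real GANs replace the linear maps with deep convolutional networks and pixel data, but the back-and-forth structure of the two updates is the same.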

The system developed by the MIT and IBM researchers showcases some neat abilities. A bit like Deep Dream, the trippy image-generating tool developed by Google researchers several years back, it shows an impressive understanding of which images fit together. As a result of being trained on a vast archive of images, it picks up an understanding of the basic rules governing relationships between objects. For instance, ask it to add an object in the sky and it won’t draw a window — since it knows that windows aren’t usually (or ever) found there.
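The “semantic brushes” Strobelt describes work by intervening on a generator’s internal units rather than on pixels. Here is a heavily simplified sketch of that idea, with invented unit names and a mock renderer standing in for a trained GAN: “painting” sets a unit’s activation over a region, and the image is then re-rendered from the activations.

```python
import numpy as np

# Toy stand-in for a GAN's intermediate layer: each "unit" has a spatial
# activation map, and a mock renderer mixes one color template per unit.
# The unit names and templates here are invented for illustration.
H = W = 8
units = {"sky": 0, "tree": 1, "dome": 2}
templates = np.array([
    [0.5, 0.7, 1.0],   # sky  -> blue-ish
    [0.1, 0.6, 0.1],   # tree -> green
    [0.8, 0.7, 0.5],   # dome -> sandstone
])
acts = np.zeros((3, H, W))
acts[units["sky"], :4, :] = 1.0    # sky across the top half

def render(acts):
    # Weighted mix of templates by unit activation at each pixel.
    weights = acts / np.maximum(acts.sum(axis=0, keepdims=True), 1e-8)
    return np.einsum("uhw,uc->hwc", weights, templates)

def brush(acts, unit, y0, y1, x0, x1, value=1.0):
    # "Painting with neurons": set a unit's activation in a region and let
    # the renderer regenerate the image, instead of editing pixels directly.
    out = acts.copy()
    out[units[unit], y0:y1, x0:x1] = value
    return out

painted = brush(acts, "tree", 5, 8, 2, 6)   # add trees near the bottom
image = render(painted)
```

Erasing works the same way with `value=0.0`. In the real system, what a unit “means” is discovered by dissecting a trained GAN, so re-rendering fills the brushed region with plausible context rather than a flat color.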

As Strobelt notes, GAN Paint Studio is not quite ready for prime time just yet. Although members of the public can have a go at using it, there’s still more work to be done. Notably, the demonstration version is currently low-resolution. However, it does showcase the immense promise of the technology.

Challenging imagination

“The most fun parts [of the technology] are actually when your imagination is challenged,” Strobelt said. “Try adding a door to the Palazzo Vecchio image; it’s kind of mind-blowing if you know the place. The system is far from perfect, and not every image can be modified equally well. There is still research needed on how to optimize all the parts. For example, when the GAN model tries to represent the input [image], it might very well use the wrong semantic units to reproduce features — it [may] just generate a door out of tree units. Figuring out when and how it does right or wrong is actually very interesting future work.”

As GANs improve over time, Strobelt expects the applications for GAN Paint Studio to open up as well. “The obvious first idea would be a photo editor with these semantic brushes and erasers,” he said. “This could help you edit vacation photos, for example. It could also allow architects to quickly create variations on the embedding of their building renderings. Game designers could [also use it to] modify level maps quicker.”

If such technology could be extended to video effects, it would also prove immensely powerful, allowing objects to be placed into shots with just the touch of a button. Should a director realize that a completed scene is missing a background item crucial to the plot, it could be quickly added in, without the need for today’s expensive and time-consuming visual effects processes.

Strobelt is decisive in saying that he doesn’t think GAN Paint Studio is truly, autonomously creative. “No,” he said. “I see this as an advanced tool to help humans who think they are not creative to challenge this thought.”

Then again, what is creativity? As with many other aspects of our lives, such as the jobs we believe only humans can do, it seems that A.I. is ready to ask the big questions.

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…