5 things AI image generators still struggle with

AI image generators like Dall-E, Stable Diffusion, Midjourney, and Bing Image Creator produce amazing results, but sometimes they can be incredibly frustrating. With simple prompts containing just a few words, an AI can output impressive images that appear to be professional photographs and convincing art in various styles. However, the same prompt will occasionally create some horrific creature or hilariously flawed rendering.

Negative prompts, which tell the generator what to leave out, might help reduce the likelihood of these errors, but complexity can’t always save you. Even AI experts struggle with misshapen creatures and otherworldly scenes, requiring long hours of refining prompts or touching up images with a traditional photo editor. For the time being, if you look carefully in the right areas of an image, there’s a good chance you’ll be able to tell whether it was made by a machine.

Hand salad and balls of fingers

AI developers have made progress in the struggle to teach artificial intelligence tools how human hands should look, but there’s plenty of room for improvement. If fingers aren’t featured prominently, it’s easy to miss errors, but it’s an ongoing problem.

Dall-E was an early AI leader, but hands are not its thing. Dall-E prompted by Alan Truly

One of the first and best AI image generators available to the public, OpenAI’s Dall-E, created these pictures of people holding hands. At first glance, they might look fine. On closer inspection, some problems become apparent. Beware of extra fingers, weird fingernails, and merged digits.

Complicated grips and interlaced fingers are even more challenging. Don’t be surprised if your AI images come back with classic glitches referred to as “hand salad” or “balls of fingers.”

Dall-E’s interlaced hands are disturbing. Dall-E prompted by Alan Truly

Troubling text and writing

You might expect that text would be easy for a computer to generate. You see words on screens every day when you pick up the phone or open a browser. Early computers, unlike the top gaming PCs of today, couldn’t display graphics of any kind. Everything was text or numbers.

Leonardo AI knows styles, but printed text is challenging. Leonardo AI prompted by Alan Truly

Yet displaying actual letters and symbols as printed or written words is surprisingly tricky for an AI image generator. It might sound like an easy problem to solve, but it isn’t. An app can’t just overlay plain text. To be convincing, the text style, shading, angle, and perspective must match the rest of the scene.

In the example, a relatively new AI image generator, Leonardo AI, made a valiant effort with a vintage billboard for Jack Rabbit Slim’s diner. After multiple tries, the AI managed to spell out “Jack Rabbit’s,” which is quite close to the request. The vintage photograph style was spot-on in each image, but the letters and words were mostly flawed.

Leonardo AI came close to getting text correct in the render on the left. Leonardo AI renders prompted by Alan Truly

The eyes don’t have it

Bing Image Creator struggles with eyes. Bing Image Creator prompted by Alan Truly

It’s often said that the eyes are the windows to the soul. We rely so much on eye contact that it could be the most critical detail in creating a realistic portrait. But many AI tools have difficulty rendering human eyes.

Bing Image Creator did a decent job with the studio background and posing a multigenerational family photo. However, almost every person has bizarre eyes that look like they’ve been inserted by aliens, or perhaps these smiling people are in the process of transforming into unearthly creatures.

Two closer examples of Bing Image Creator’s disturbing eye issues. Bing Image Creator prompted by Alan Truly

Troublesome tools

Humans are great with tools, and not only the digital variety like AI. We quickly master any physical tool within our grasp. An AI, on the other hand, struggles to understand what tools are and how they’re used.

Midjourney understands hands, but is puzzled by wrenches. Is that a light bulb at the bottom left? Midjourney prompted by Alan Truly

Midjourney is an AI image generator that’s making fantastic progress in solving problems with human faces and hands. However, when it was prompted to show a mechanic tightening a bolt with a wrench, the tool was entirely absent from the results. Fingernails are added to gloves in one case, and a light bulb somehow appears in another.

Scissors are too complicated for Bing Image Creator in this closeup render of hair being cut. They are only open in one image and never appear to be in the act of cutting.

Bing Image Creator can’t figure out how scissors work. Bing Image Creator prompted by Alan Truly

Nightmare teeth

Stable Diffusion renders of smiles sometimes have too many teeth. Stable Diffusion via Leonardo AI, prompted by Alan Truly

When people smile and laugh, that usually improves a picture, making it pleasant and fun. When given a simple prompt like two students smiling and laughing, an AI can turn this into nightmare fuel with multiple rows of teeth and other strange distortions.

Leonardo AI allows you to choose between several models, and some handle teeth well. The popular Stable Diffusion 2.1 model needed some help to get teeth right. With some negative prompting, the issue was resolved. There are solutions to these AI image problems, but it still takes work to get good results.

Stable Diffusion smiles benefit from negative prompts to remove “weird teeth” and “distorted mouth.” Stable Diffusion via Leonardo AI, prompted by Alan Truly
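
For readers who script their image generation, the negative prompting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `build_request` and the `prompt` and `negative_prompt` field names are hypothetical and do not reflect Leonardo AI's actual interface. The idea is simply that the unwanted traits are collected into their own list and sent alongside the main prompt.

```python
def build_request(prompt, negative_terms=()):
    """Assemble a request for a hypothetical image generation API."""
    # Drop blank and repeated terms while keeping the original order.
    cleaned = []
    for term in negative_terms:
        term = term.strip()
        if term and term not in cleaned:
            cleaned.append(term)
    return {
        "prompt": prompt.strip(),
        # Assumption: the tool takes negative terms as one comma-separated string.
        "negative_prompt": ", ".join(cleaned),
    }


request = build_request(
    "two students smiling and laughing",
    ["weird teeth", "distorted mouth", "weird teeth"],
)
print(request["negative_prompt"])  # prints: weird teeth, distorted mouth
```

Keeping the negative terms in a separate list makes it easy to reuse the same set across several attempts while changing only the main prompt.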

AI art is improving rapidly

In the early days of AI art, the results were weird and wonderful, creating beauty and horror with equal abandon. The errors are becoming less noticeable with each new update, and many problems can be overcome with some refinement.

With so many AI tools available, it’s easy to try another system. Many AI image generators allow negative prompts or other options to adjust the algorithm and get better results.

You may need to run through several attempts to get a usable picture, particularly if there’s a focus on faces or hands. When you want to include print or written words, be prepared to spend time in an image editor erasing the AI’s nonsense letters and blending in the correct text.

The good news is that many AI image generators are free, and subscription models are relatively inexpensive. Within a year, these lingering problems could be resolved, allowing you to use an AI render as a finished art piece or a replacement for a photograph.

Alan Truly
Computing Writer
Alan is a Computing Writer living in Nova Scotia, Canada. A tech-enthusiast since his youth, Alan stays current on what is…