Skip to main content

ChatGPT’s new upgrade finally breaks the text barrier

OpenAI is rolling out new functionalities for ChatGPT that will allow prompts to be executed with images and voice directives in addition to text.

The AI brand announced on Monday that it will be making these new features available over the next two weeks to ChatGPT Plus and Enterprise users. The voice feature is available in iOS and Android in an opt-in capacity, while the images feature is available on all ChatGPT platforms. OpenAI notes it plans to expand the availability of the images and voice features beyond paid users after the staggered rollout.

OpenAI image prompt.
Twitter/X

The voice chat functions as an auditory conversation between the user and ChatGPT. You press the button and say your question. After processing the information, the chatbot gives you an answer in auditory speech instead of in text. The process is similar to using virtual assistants such as Alexa or Google Assistant and could be the preamble to a complete revamp of virtual assistants as a whole. OpenAI’s announcement comes just days after Amazon revealed a similar feature coming to Alexa.

Recommended Videos

To implement voice and audio communication with ChatGPT, OpenAI uses a new text-to-speech model that is able to generate “human-like audio from just text and a few seconds of sample speech.” Additionally, its Whisper model can “transcribe your spoken words into text.”

OpenAI says it’s aware of the issues that could arise due to the power behind this feature, including, “the potential for malicious actors to impersonate public figures or commit fraud.”

This is one of the main reasons the company plans to limit the use of its new features to “specific use cases and partnerships.” Even when the features are more widely available they will be accessible mainly to more privileged users, such as developers.

ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms). https://t.co/uNZjgbR5Bm pic.twitter.com/paG0hMshXb

— OpenAI (@OpenAI) September 25, 2023

The image feature allows you to capture an image and input it into ChatGPT with your question or prompt. You can use the drawing tool with the app to help clarify your answer and have a back-and-forth conversation with the chatbot until your issue is resolved. This is similar to Microsoft’s new Copilot feature in Windows, which is built on OpenAI’s model.

OpenAI has also acknowledged the challenges of ChatGPT, such as its ongoing hallucination issue. When aligned with the image feature, the brand decided to limit certain functionalities, such as the chatbot’s “ability to analyze and make direct statements about people.”

ChatGPT was first introduced as a text-to-speech tool late last year; however, OpenAI has quickly expanded its prowess. The original chatbot based on the GPT-3 language model has since been updated to GPT-3.5 and now GPT-4, which is the model that is receiving the new feature.

When GPT-4 first launched in March, OpenAI announced various enterprise collaborations, such as Duolingo, which used the AI model to improve the accuracy of the listening and speech-based lessons on the language learning app. OpenAI has collaborated with Spotify to translate podcasts into other languages while preserving the sound of the podcaster’s voice. The company also spoke of its work with the mobile app, Be My Eyes, which works to aid blind and low-vision people. Many of these apps and services were available ahead of the images and voice update.

Fionna Agomuoh
Fionna Agomuoh is a Computing Writer at Digital Trends. She covers a range of topics in the computing space, including…
The best AI chatbots to try: ChatGPT, Gemini, and more
Bing Chat shown on a laptop.

The idea of chatbots has been around since the early days of the internet. But even compared to popular voice assistants like Siri, the generated chatbots of the modern era are far more powerful.

Yes, you can converse with them in natural language. But these AI chatbots can generate text of all kinds, from poetry to code, and the results really are exciting. ChatGPT remains in the spotlight, but as interest continues to grow, more rivals are popping up to challenge it.
OpenAI ChatGPT and ChatGPT Plus

Read more
Microsoft Copilot: how to use this powerful AI assistant
Man using Windows Copilot PC to work

In the rapidly evolving landscape of artificial intelligence, Microsoft's Copilot AI assistant is a powerful tool designed to streamline and enhance your professional productivity. Whether you're new to AI or a seasoned pro, this guide will help you through the essentials of Copilot, from understanding what it is and how to sign up, to mastering the art of effective prompts and creating stunning images.

Additionally, you'll learn how to manage your Copilot account to ensure a seamless and efficient user experience. Dive in to unlock the full potential of Microsoft's Copilot and transform the way you work.
What is Microsoft Copilot?
Copilot is Microsoft's flagship AI assistant, an advanced large language model. It's available on the web, through iOS, and Android mobile apps as well as capable of integrating with apps across the company's 365 app suite, including Word, Excel, PowerPoint, and Outlook. The AI launched in February 2023 as a replacement for the retired Cortana, Microsoft's previous digital assistant. It was initially branded as Bing Chat and offered as a built-in feature for Bing and the Edge browser. It was officially rebranded as Copilot in September 2023 and integrated into Windows 11 through a patch in December of that same year.

Read more
ChatGPT’s new Canvas feature sure looks a lot like Claude’s Artifacts
ChatGPT's Canvas screen

Hot on the heels of its $6.6 billion funding round, OpenAI on Thursday debuted the beta of a new collaboration interface for ChatGPT, dubbed Canvas.

"We are fundamentally changing how humans can collaborate with ChatGPT since it launched two years ago," Canvas research lead Karina Nguyen wrote in a post on X (formerly Twitter). She describes it as "a new interface for working with ChatGPT on writing and coding projects that go beyond simple chat."

Read more