OpenAI first announced ChatGPT's advanced Voice Mode back in May during its Spring Update, and it has been rather quiet about the rollout since. But in a new post on X (formerly Twitter), OpenAI has delivered an update on the situation, indicating when the feature will finally become available more broadly.
According to the announcement, a broad rollout won’t happen until “this fall,” with OpenAI clarifying that “exact timelines depend on meeting our high safety and reliability bar.”
A smaller alpha release, however, will come in late July. The post admits that it’s coming a bit late, stating: “We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch.”
“We're sharing an update on the advanced Voice Mode we demoed during our Spring Update, which we remain very excited about: We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch.…”
— OpenAI (@OpenAI) June 25, 2024
Some will be disappointed to hear that the delay is official, especially since the feature was supposed to roll out “in the coming weeks,” as OpenAI stated in May. One X user was upset that OpenAI had teased people into signing up for the paid ChatGPT Plus subscription, only for the feature to take months to arrive. Even so, an official acknowledgment is always appreciated by those desperately waiting.
In the meantime, the post specifies that OpenAI is continuing to make improvements to the overall system: “For example, we’re improving the model’s ability to detect and refuse certain content. We’re also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses. We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.”
The advanced Voice Mode made quite an impression when it was revealed in May, showcasing near-instant, human-like responses — complete with emotional expression such as laughter. The demo even allowed the user to cut the AI off mid-sentence while the conversation retained its continuity.