OpenAI may have secret watermarking tool for ChatGPT

ChatGPT on a phone on an encyclopedia (Shantanu Kumar / Pexels)

ChatGPT plagiarists beware: OpenAI has developed a tool capable of detecting GPT-4’s writing output with a reported 99.99% accuracy. However, the company has spent more than a year waffling over whether to release it to the public.

The company is reportedly taking a “deliberate approach” due to “the complexities involved and its likely impact on the broader ecosystem beyond OpenAI,” per TechCrunch. “The text watermarking method we’re developing is technically promising, but has important risks we’re weighing while we research alternatives, including susceptibility to circumvention by bad actors and the potential to disproportionately impact groups like non-English speakers,” an OpenAI spokesperson said.


The text-watermarking system works by incorporating a specific pattern into the model’s written output that is detectable by the OpenAI tool but invisible to the end user. While the tool can reliably spot writing generated by OpenAI’s own GPT-4 engine, it cannot detect the output of other models such as Gemini or Claude. What’s more, the watermark can be removed by running the text through Google Translate, converting it to another language and then back.
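OpenAI has not published how its watermark works, so the following is only a toy illustration of one publicly discussed style of statistical watermark, not the company’s method. It assumes a secret key that splits a tiny made-up vocabulary into “green” and “red” halves depending on the previous word; the generator favors green words, and the detector counts how often green words appear. Every name here (`VOCAB`, `is_green`, `generate`, `z_score`, the key string) is hypothetical.

```python
import hashlib
import math
import random

# Tiny stand-in vocabulary; a real model would have tens of thousands of tokens.
VOCAB = ["the", "a", "model", "text", "writes", "reads", "quickly", "slowly",
         "output", "input", "and", "or", "clear", "plain", "long", "short"]


def is_green(key: str, prev: str, word: str) -> bool:
    """Assign `word` to the green or red half, keyed on the secret and the previous word."""
    digest = hashlib.sha256(f"{key}|{prev}|{word}".encode()).digest()
    return digest[0] % 2 == 0


def generate(key: str, length: int, rng: random.Random, watermark: bool = True) -> list:
    """Emit random words; with watermarking on, pick only from the green half."""
    words = ["the"]
    for _ in range(length):
        candidates = VOCAB
        if watermark:
            green = [w for w in VOCAB if is_green(key, words[-1], w)]
            candidates = green or VOCAB  # fall back if no word happens to be green
        words.append(rng.choice(candidates))
    return words


def z_score(key: str, words: list) -> float:
    """How far the green-word count sits above the 50% expected by chance."""
    n = len(words) - 1
    hits = sum(is_green(key, p, w) for p, w in zip(words, words[1:]))
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


if __name__ == "__main__":
    secret = "hypothetical-secret-key"
    marked = generate(secret, 200, random.Random(1), watermark=True)
    plain = generate(secret, 200, random.Random(2), watermark=False)
    print("watermarked z-score:", round(z_score(secret, marked), 2))
    print("unmarked z-score:   ", round(z_score(secret, plain), 2))
```

In this sketch the reader sees ordinary-looking words either way, while anyone holding the key sees a large z-score for watermarked text and a small one for everything else. It also hints at why translating the text and back could erase the signal: replacing the words replaces the pattern.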

This isn’t OpenAI’s first attempt at building a text-detection tool. Last year, it quietly axed a similar text detector it had in development due to the tool’s paltry detection rate and propensity for false positives. Released in January 2023, that detector required users to manually input sample text at least 1,000 characters long before it could make a determination. It correctly flagged only 26% of AI-generated content and mislabeled human-written content as AI-derived 9% of the time. It also led one Texas A&M professor to incorrectly fail an entire class for supposedly using ChatGPT on their final assignments.
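To see why those two numbers made the old detector so risky, here is a rough back-of-the-envelope calculation. It reads the 26% figure as the share of AI-written text the tool correctly flagged and the 9% figure as the share of human-written text it wrongly flagged. The 20% share of AI-written essays is purely an assumption for illustration, and `flag_precision` is a made-up helper name.

```python
def flag_precision(true_positive_rate: float,
                   false_positive_rate: float,
                   ai_share: float) -> float:
    """Of all texts the detector flags, what fraction is actually AI-written?"""
    flagged_ai = true_positive_rate * ai_share
    flagged_human = false_positive_rate * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


# Assumption for illustration only: 1 in 5 submitted essays is AI-written.
ASSUMED_AI_SHARE = 0.20

if __name__ == "__main__":
    precision = flag_precision(0.26, 0.09, ASSUMED_AI_SHARE)
    print(f"Flagged essays that are really AI-written: {precision:.0%}")
```

Under that assumed mix, only about 42% of flagged essays would actually be AI-written, meaning most accusations would land on students who wrote their own work.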

OpenAI is also reportedly hesitant to release the tool for fear of a user backlash. Per the Wall Street Journal, 69% of ChatGPT users believe that such a tool would be unreliable and likely result in false accusations of cheating. Another 30% reported they would willingly drop the chatbot in favor of a different model should OpenAI actually roll out the feature. The company also fears developers would be able to reverse engineer the watermark and build tools to negate it.

Even as OpenAI debates the merits of releasing its watermarking system, other AI startups are rushing to release text detectors of their own, including GPTZero, ZeroGPT, Scribbr, and Writer AI Content Detector. However, given their general lack of accuracy, the human eye remains our best method of spotting AI-generated content, which is not reassuring.
