On Tuesday, OpenAI, the company behind the viral chatbot ChatGPT, released a tool that detects whether a piece of text was written by AI or by a human being. Unfortunately, it’s accurate only about 1 out of 4 times.
“Our classifier is not completely reliable,” the company wrote in a blog post on its website. “We’re making [it] publicly available for feedback on whether imperfect tools like this are useful.”
OpenAI claimed that its detection tool correctly identifies 26% of AI-written text as “probably AI-written” and incorrectly labels human-written text as AI-written 9% of the time.
Since its launch in November, ChatGPT has become hugely popular around the world for answering all kinds of questions with seemingly intelligent answers. Last week, it was reported that ChatGPT had passed the final exam for the Wharton School MBA program at the University of Pennsylvania.
The bot has raised concerns, especially among academics, who worry that high school and college students will use it to do their homework and complete assignments. Recently, a 22-year-old Princeton senior became the darling of professors everywhere after he created a website that can detect whether a piece of writing was created using ChatGPT.
OpenAI seems aware of the problem. “We are engaging with educators across the US to learn what they see in their classrooms and discuss the capabilities and limitations of ChatGPT, and we will continue to expand our reach as we learn,” the company wrote in its announcement.
Still, by OpenAI’s own admission and BuzzFeed News’ completely unscientific testing, no one should rely solely on the company’s detection tool just yet, because sometimes it fails spectacularly.
We asked ChatGPT to write 300 words each about Joe Biden, Kim Kardashian, and Ron DeSantis, then used OpenAI’s own tool to detect whether an AI had written the text. We got three different results: the tool said the article on Biden was “highly unlikely” to be AI-generated and the article on Kardashian was “possibly” generated by AI, while it was “unclear” whether the ChatGPT-generated article about DeSantis was written by AI.
Other people who played with the detection tool also noticed that it failed quite spectacularly. When The Intercept’s Sam Biddle pasted in a snippet of text from the Bible, the OpenAI tool said it was “likely” AI-generated.