A 22-year-old college student has developed an app that he claims can detect whether text was written by ChatGPT, the explosively popular chatbot raising plagiarism fears in academia.
Edward Tian, a senior at Princeton University, developed GPTZero over his winter break. It attracted 30,000 views within a week of its release.
Tian said his motivation was the growing use of artificial intelligence to cheat on assessments by producing quick, credible academic writing that can evade anti-plagiarism software.
His initial tweet, which claimed the app could “quickly and efficiently” detect whether an essay had been written by artificial intelligence, went viral with more than 5 million views.
Streamlit, the free platform that hosts GPTZero, has since provided Tian with extra hosting capacity and memory to keep up with the web traffic.
To determine whether a text was written by artificial intelligence, the app measures its “perplexity”, which gauges the complexity of the text, and its “burstiness”, which compares the variation between its sentences.
The more familiar the text is to the bot, which has been trained on similar data, the lower its perplexity and the more likely it is to have been generated by AI; human writing also tends to be “burstier”, mixing long, complex sentences with short ones.
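GPTZero’s exact scoring is not public, but the two measures can be approximated with open tools. The sketch below is an illustration only, not Tian’s code: it uses the GPT-2 model from the Hugging Face transformers library to estimate a passage’s perplexity, and treats the spread of per-sentence perplexities as a rough stand-in for burstiness.

    # Illustrative sketch only -- not GPTZero's actual method.
    # Perplexity is estimated with GPT-2 (Hugging Face transformers);
    # "burstiness" is approximated as the spread of per-sentence perplexities.
    import math
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        # Average next-token surprise under the model; lower = more predictable.
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss  # mean cross-entropy over tokens
        return math.exp(loss.item())

    def burstiness(text: str) -> float:
        # Standard deviation of per-sentence perplexity; human prose tends to vary more.
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        scores = [perplexity(s) for s in sentences]
        mean = sum(scores) / len(scores)
        return (sum((x - mean) ** 2 for x in scores) / len(scores)) ** 0.5

    sample = "The cat sat on the mat. It had been an unusually quiet afternoon in the old house."
    print(f"perplexity: {perplexity(sample):.1f}  burstiness: {burstiness(sample):.1f}")

In this toy setup, low perplexity combined with low burstiness would point toward machine-generated text; the thresholds, model and sentence handling GPTZero actually uses are assumptions here.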
Tian told subscribers that the newer model used the same principles, but with an improved ability to detect artificial intelligence in text.
“When testing the new model in a data set of BBC news articles and AI-generated articles from the same headline prompts, the improved model has a false positive rate of <2%,” he said.
“In the coming months, I will be fully focused on building GPTZero, improving the capabilities of the model, and fully scaling the application.”
Toby Walsh, Scientia Professor of Artificial Intelligence at the University of New South Wales, was not convinced.
He said that unless the app was acquired by a major company, it was unlikely to have much impact on plagiarism carried out with ChatGPT.
“It’s always an arms race between the technology to identify synthetic text and the applications that generate it,” he said. “And it’s easy enough to ask ChatGPT to rewrite in a nicer style … like rephrasing it like an 11-year-old.
“This will make it more difficult, but it won’t stop it.”
Walsh said users could also ask ChatGPT to add more “randomness” to its text to evade detectors, or obfuscate it with different synonyms and grammatical tweaks.
Meanwhile, he said, every application developed to detect synthetic text gives artificial intelligence programs a greater ability to evade detection.
And every time a user logs into ChatGPT, it generates human feedback, both implicit and explicit, that helps the model improve.
“There is a deep and fundamental technical reason why we will never win the arms race,” Walsh said.
“Every program used to identify synthetic text can be added to [the programs that] generate synthetic text to fool them … it’s always going to be like that.
“We are training it, and it is improving day by day.”
GPTZero users have cited mixed results.
“It seems to work, both for text generated entirely by GPT models and for text generated with some human intervention,” wrote one user.
“However … it doesn’t work well with essays written by good writers. It falsely flagged many of those essays as written by AI.
“This makes it at once a very useful tool for teachers and a very dangerous one: relying on it too heavily would lead to more false flags.”
“Nice try, but ChatGPT is very good at what it does,” another user wrote.
“I pasted in about 350 words of French … most of them generated by ChatGPT. The text was lightly hand-edited for style and was generated with strong, forced context, which led to the presence of proper nouns.
“That text passed the GPTZero test as human … I’m not entirely convinced that proper human-AI collaboration can be flagged.”