🧠 AI vs. Itself: Detecting AI-Generated Text

PLUS: How OpenAI Tried and Failed to Watermark ChatGPT Outputs

Rohun Shroff
March 10, 2024

Welcome back, AI prodigies!

In today’s Sunday Special:

🕵️‍♂️Can AI Detect Itself?
🦾Specialists > Generalists?
👨‍⚖️You Be The Judge
🔑Key Takeaway

Read Time: 6 minutes

🎓Key Terms

Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate and classify text, answer questions conversationally, and translate languages.
Cryptography: using mathematical operations to hide or encode information so only the intended audience can read it.
Machine Learning (ML): enables computers to learn from data and make decisions or predictions without being explicitly programmed to do so.

🕵️‍♂️CAN AI DETECT ITSELF?

Several AI detection tools have emerged as the line between human-written and AI-generated text has become increasingly blurred. Like Large Language Models (LLMs), AI detectors are still maturing. Some detectors claim accuracy rates of up to 90%, but those claims still need independent, scientifically sound validation.

One study in the International Journal for Educational Integrity (IJEI) tested the accuracy of five detection tools: OpenAI’s Classifier, Writer, Copyleaks, GPTZero, and CrossPlag. Researchers asked OpenAI's GPT-3.5 Turbo and GPT-4 models to “write around 100 words on the application of cooling towers in the engineering process.” Five human-written paragraphs served as a control group. The researchers scored the tools on four standard performance metrics:

Specificity: the proportion of human-written content correctly identified as human-written, out of all the human-written content.
Sensitivity: the proportion of AI-generated content correctly identified as AI-generated, out of all the AI-generated content.
Negative Predictive Value (NPV): the likelihood that content identified as human-written is actually human-written.
Positive Predictive Value (PPV): the likelihood that content identified as AI-generated is actually AI-generated.
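
To make the four metrics concrete, here is a minimal Python sketch. The counts are invented for illustration and are not the study's results; the names ConfusionCounts and detector_metrics are our own, and the sketch assumes no denominator is zero.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    true_pos: int   # AI-generated text flagged as AI-generated
    false_neg: int  # AI-generated text flagged as human-written
    true_neg: int   # human-written text flagged as human-written
    false_pos: int  # human-written text flagged as AI-generated


def detector_metrics(c: ConfusionCounts) -> dict:
    """Compute the four metrics from raw counts (assumes non-zero denominators)."""
    return {
        # Of all AI-generated texts, how many were caught?
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # Of all human-written texts, how many were cleared?
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        # Of all texts flagged as AI-generated, how many really were?
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        # Of all texts flagged as human-written, how many really were?
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical detector run: 30 AI-generated texts and 5 human-written texts.
counts = ConfusionCounts(true_pos=24, false_neg=6, true_neg=4, false_pos=1)
metrics = detector_metrics(counts)
print(metrics)
```

With these made-up counts, sensitivity and specificity both come out to 0.8, PPV to 0.96, and NPV to only 0.4. When human-written samples are scarce, a handful of missed AI texts is enough to drag NPV down.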

All detectors except OpenAI’s Classifier failed to identify human-written content (i.e., NPV) at an acceptable rate. On the other hand, detecting AI-generated text (i.e., PPV) was far easier for every tool except Writer. We’ve excluded the exact accuracy metrics, as the small sample size and short text length of 100 words limit their validity. Despite this decent detection ability, the war between conversational chatbots and detectors will continue. Across all detectors, accuracy declined significantly on GPT-4 outputs compared with GPT-3.5 Turbo outputs. In fact, OpenAI discontinued its own classification tool, which correctly identified just 26% of AI-written texts as “likely AI-written” and was unreliable on texts of less than 200 words.

🦾SPECIALISTS > GENERALISTS?

Specialized detection tools show greater potential. A Machine Learning (ML) tool called “ChatGPT Detector” correctly identifies AI-written chemistry research papers by reading an abstract or introductory section. After being trained on just 200 GPT-3.5 Turbo outputs and 100 published articles, the tool correctly identified 99% of ChatGPT-written pieces. It also identified research articles generated by GPT-4. However, before “ChatGPT Detector” can become academia’s preeminent plagiarism detector, others must replicate this experiment while varying the following:

Length
Complexity
Writing Styles for Human-Written and Chatbot-Generated Papers
Types of Chatbots (i.e., Generalized vs. Specialized)
The Academic Fields of Papers

Developers also tune detection tools to look for either human text or AI text, which means accepting more false positives (i.e., classifying human text as AI-generated) in exchange for fewer false negatives (i.e., classifying AI text as human-written), or vice versa. Copyleaks boasts a false positive rate of just 0.2% because it’s trained to look for human text, which has more discernible features like syntax errors, slang use, and synthesized analyses of complex topics.
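
The trade-off between false positives and false negatives can be sketched with a single decision threshold. This is a toy illustration, not how Copyleaks or any real detector works: the scores, labels, and the error_rates helper are all hypothetical.

```python
def error_rates(samples, threshold):
    """Return (false positive rate, false negative rate) at a given threshold.

    samples: list of (ai_score, true_label) pairs, where ai_score is a
    hypothetical detector score in [0, 1] and true_label is "human" or "AI".
    A text is flagged as AI-generated when ai_score >= threshold.
    """
    human_scores = [score for score, label in samples if label == "human"]
    ai_scores = [score for score, label in samples if label == "AI"]
    # False positive: human text flagged as AI-generated.
    fpr = sum(score >= threshold for score in human_scores) / len(human_scores)
    # False negative: AI text passed off as human-written.
    fnr = sum(score < threshold for score in ai_scores) / len(ai_scores)
    return fpr, fnr


# Invented scores for four human-written and four AI-generated texts.
samples = [
    (0.10, "human"), (0.35, "human"), (0.55, "human"), (0.70, "human"),
    (0.40, "AI"), (0.65, "AI"), (0.80, "AI"), (0.95, "AI"),
]

lenient = error_rates(samples, threshold=0.5)  # flags text readily
strict = error_rates(samples, threshold=0.9)   # flags only near-certain cases
print("lenient:", lenient)
print("strict:", strict)
```

On this invented data, the lenient threshold wrongly flags half of the human texts but misses only a quarter of the AI texts. The strict threshold wrongly flags no human texts but misses three quarters of the AI texts. A tool that advertises a very low false positive rate is sitting near the strict end of that dial.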

Regardless of how specialized, well-trained, or fine-tuned a detection tool is, it will never be 100% accurate. That’s why OpenAI is trying to embed hidden watermarks throughout each ChatGPT output. Derived from cryptography, these signals would be impossible for humans to identify. For example, ChatGPT might place a specific letter every 20 words. In a more complex example, three-letter words follow two-letter words 4% of the time. Presently, these techniques only work on longer pieces of text, and they’re not ready for public use. But pragmatists know that workarounds will always exist. For instance, a user could copy and paste the watermarked output into a watermark-less chatbot for modification. The cat-and-mouse game will proceed in perpetuity, rendering all detection methods imperfect. Therefore, we must use our judgment first and then turn to detection tools for more insight.
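
Here is a toy Python sketch of how a checker might test for the illustrative "three-letter words follow two-letter words 4% of the time" signal. It is based only on the example above and is not OpenAI's actual watermarking scheme. The function names, the tolerance, and the minimum-pairs cutoff are our own assumptions.

```python
import re


def follower_counts(text):
    """Count how often a two-letter word is followed by a three-letter word.

    Returns (hits, pairs): pairs is the number of two-letter words that have
    a following word, and hits is how many of those followers have 3 letters.
    """
    words = re.findall(r"[A-Za-z]+", text)
    followers = [nxt for cur, nxt in zip(words, words[1:]) if len(cur) == 2]
    hits = sum(len(word) == 3 for word in followers)
    return hits, len(followers)


def looks_watermarked(text, target_rate=0.04, tolerance=0.01, min_pairs=200):
    """Return True or False, or None when the text is too short to judge.

    target_rate comes from the article's example; tolerance and min_pairs
    are arbitrary values chosen for this sketch.
    """
    hits, pairs = follower_counts(text)
    if pairs < min_pairs:
        return None  # not enough two-letter words for a meaningful rate
    return abs(hits / pairs - target_rate) <= tolerance


short_text = "It is the end of an era so we go on"
print(follower_counts(short_text))    # (2, 7)
print(looks_watermarked(short_text))  # None
```

The short sample returns None rather than a verdict, which mirrors why these techniques only work on longer pieces of text: a rate like 4% can't be measured reliably from a handful of word pairs. A real detector would also need a proper statistical test, since ordinary text can hit the target rate by chance.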

👨‍⚖️YOU BE THE JUDGE

Current human detection abilities need improvement. In a Cornell study, people thought 66% of news articles generated by GPT-2 were credible. Another Cornell study found that detection abilities were no better than random chance. A 1,000-person (i.e., unscientific) survey yielded a success rate of just 57%.

We can stay ahead of the word prediction machines by focusing on a few factors. Here are the top 5 characteristics of AI-generated text:

Perfect Syntax
False Statements of Basic Fact
Basic Word Choice (i.e., No Slang or Technical Jargon)
Word Repetition, Especially With Complex Topics
Generic or Matter-Of-Fact Tone

Now, put your skills to the test with this blurb: “The AI Pulse is a free daily newsletter reporting the latest trends, tools, and tips. It is written by Rohun and James, two USC students. AI is arguably the most significant technological invention since the internet. Rohun and James will help you learn about it.”

🩺 PULSE CHECK

Was this paragraph AI-generated?

🔑KEY TAKEAWAY

Although humans will almost certainly stay one step ahead of detection tools, purely AI-generated text won’t change the world. As long as it’s generated without proprietary data or novel prompting techniques, anyone can replicate it. At best, AI-generated text can produce something mediocre or help someone make a quick buck, but it fails to provide long-term academic, artistic, or commercial value for now.
