
Welcome back AI prodigies!
In today's Sunday Special:
How Does It Work?
The Winner Doesn't Matter
How and Why Was It Created?
Key Takeaway
Read Time: 7 minutes
Key Terms
Binary Digit (i.e., "Bit"): The smallest unit of data that a computer can process.
Generative AI (GenAI): Uses AI models trained on text, image, audio, video, or code to generate new content.
Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.
Anthropomorphize: To give human traits, emotions, or behaviors to non-human things, like animals, objects, or machines.
🩺 PULSE CHECK
Could you tell the difference between a human and a chatbot?
HOW DOES IT WORK?
"The Turing Test," Explained.
This summer, OpenAI's ChatGPT passed "The Turing Test." British mathematician Alan Turing introduced "The Turing Test" in 1950 to measure a computer's ability to exhibit human-like intelligence.
"The Turing Test" had three participants:
A Computer
A Human Foil
A Human Interrogator
The Human Interrogator attempts to determine which participant is the Computer and which is the Human Foil by asking a series of questions through a keyboard. If the Computer can consistently fool the Human Interrogator, it's considered an intelligent, thinking entity.
"We weren't even close to passing it in 2021. Then, OpenAI's ChatGPT passed it," said former PayPal CEO Peter Thiel. "That was the Holy Grail of AI research for the previous 60 years."
How Do You Pass It?
But who's the Human Interrogator? What's required to pass it? Turing never answered these questions because "The Turing Test" wasn't meant to be a benchmark for comparing the performance of different AI models.
However, in his landmark 1950 paper, "Computing Machinery and Intelligence," he predicted that by 2000, machines with a capacity of 1 billion Bits of memory would fool a Human Interrogator 30% of the time after five minutes of questioning.
Remarkably, part of his prediction came true: a PC bought in 2000 contained a little more than a billion Bits. However, during The Loebner Prize in 2000, an annual competition where Human Interrogators interacted with Human Foils and Computers, the Human Interrogators could easily distinguish between them. Even though Turing's timeline was slightly off, he got the trend exactly right: chatbots gradually became more sophisticated over time.
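Turing's billion-Bit figure is easy to sanity-check with quick arithmetic. The sketch below converts his estimate into megabytes; the 2000-era RAM sizes in the comment are general knowledge, not figures from Turing's paper:

```python
# Turing's 1950 prediction: machines with about 10**9 binary digits of storage.
bits = 10**9

bytes_total = bits // 8              # 8 Bits per byte
megabytes = bytes_total / 1_000_000  # decimal megabytes

# A typical PC bought in 2000 shipped with roughly 64-128 MB of RAM,
# i.e., a little more than Turing's billion Bits at the high end.
print(megabytes)  # 125.0
```

So a billion Bits is about 125 MB, which is why a turn-of-the-millennium PC fit his prediction so neatly.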
THE WINNER DOESN'T MATTER
ELIZA Computer Program, Explained.
Examples of machines deceiving humans stretch back to the 1960s, starting with a program vaguely resembling a modern chatbot. Launched in 1966 by Massachusetts Institute of Technology (MIT) Professor Joseph Weizenbaum, the ELIZA Computer Program simulated a conversation between a patient and their psychotherapist. Weizenbaum deliberately selected the psychotherapy context to avoid the challenge of equipping the program with extensive real-world knowledge. By reflecting the patient's statements back at them, ELIZA could sustain a dialogue without requiring a deep understanding of factual information.

Although this design aimed to highlight the superficial nature of communication between humans and machines, ELIZA often appeared intelligent enough to convince some patients it was human. Weizenbaum recounted an instance where his secretary asked him to leave the room so she could have a genuine conversation with ELIZA. Reflecting on this, he expressed his surprise: "What I had not realized is that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
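ELIZA's reflection trick is simple enough to sketch in a few lines. The Python below is a minimal, hypothetical illustration of the pattern-matching-and-reflection idea, not Weizenbaum's original DOCTOR script; the rules and word list are invented for the example:

```python
import re

# Words swapped when a captured fragment is "reflected" back at the speaker.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# (pattern, response template) pairs, loosely in the spirit of ELIZA's rules.
RULES = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]

def reflect(fragment: str) -> str:
    """Swap first-person words so the fragment reads from ELIZA's side."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())

def respond(statement: str) -> str:
    """Match the statement against the rules; fall back to a generic prompt."""
    statement = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(statement)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please, go on."

print(respond("I feel anxious about my job"))  # Why do you feel anxious about your job?
```

Notice that no rule requires any factual knowledge: the program only rearranges the patient's own words, which is exactly why the psychotherapy framing was such a clever choice.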
ELIZA > GPT-3.5?
Not only did ELIZA's conversational ability turn heads in the 1960s, but it also impresses AI researchers today. A group of cognitive scientists from the University of California San Diego (UCSD) recently evaluated GPT-3.5, GPT-4, and ELIZA in a version of "The Turing Test."
After completing this version of "The Turing Test" 1,405 times:
GPT-3.5 tricked the Human Interrogator 20% of the time.
GPT-4 tricked the Human Interrogator 50% of the time.
ELIZA tricked the Human Interrogator 22% of the time.
So, how did the 60-year-old ELIZA beat GPT-3.5? Because ELIZA doesn't exhibit the behaviors we associate with today's LLMs. For instance, GPT-3.5 was fine-tuned to adopt a formal tone and avoid expressing opinions, which makes it appear less human.
Also, as chatbots become more advanced, so do our abilities to detect them. For this reason, "The Turing Test" can't serve as a benchmark for comparing the performance of different AI models. However, by examining how and why Turing created it, we can ask better questions about the capabilities of GenAI.
HOW AND WHY WAS IT CREATED?
The Origin Story?
Contrary to popular belief, "The Turing Test" originally involved three participants:
A Man
A Woman
A Human Judge
The Human Judge attempts to determine which participant is the Man and which is the Woman based on written communication. Turing then replaced one of the participants with a Computer, shifting the question to whether the Human Judge could distinguish between a human and a Computer.
So, what exactly was Turing testing? Was the gender-guessing version of "The Turing Test" probing identity, deception, or performance? How do these objectives translate to the Computer version? Turing himself offered a clue:
"The original question, 'Can machines think?' I believe to be too meaningless to deserve discussion. Nevertheless, I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted."
For Turing, the test wasn't meant to settle the debate over whether machines can "think." He considered that question too vague and meaningless to be worth debating. Instead, he predicted that people would eventually accept the idea of machines "thinking" without hesitation as societal norms evolved to accommodate it. Whether we've already reached that point is up for debate. We anthropomorphize chatbots all the time now. For example, we might say ChatGPT is smart, but does that mean ChatGPT can think? Turing might say it doesn't matter, because we already treat chatbots like thinking entities.
Debates Show Historical Context?
To fully appreciate how "The Turing Test" came to be, it helps to view it through the lens of Turing's debates with critics. In his historical analysis, "The Turing Test Argument," Bernardo Gonçalves frames Turing's proposal in its social, cultural, and historical context, reconstructing a debate in the 1940s between Turing and three critics: mathematician Douglas Hartree, neurosurgeon Geoffrey Jefferson, and polymath Michael Polanyi.
Hartree argued that computers were merely calculation engines, incapable of creativity or spontaneity. Turing countered by emphasizing the potential of learning machines, or âunorganized machines,â to adapt and grow in ways that surpassed simple programming.
Jefferson was perhaps Turingâs most formidable critic, insisting that machines could only be considered intelligent if they could demonstrate creative expression. He famously declared that intelligence required not just producing a poem or sonnet but understanding that one had written it.
Polanyi argued that human intelligence relies on tacit knowledge: an intuitive, unformalized understanding that machines can't replicate. This critique likely influenced Turing's decision to focus on conversational ability rather than rule-based activities like chess.
KEY TAKEAWAY
Today's adaptations of "The Turing Test" obscure its original purpose. Turing reframed the debate, shifting the focus from whether machines can think to whether it matters if they do.
If chatbots can't reliably pass "The Turing Test," it means we're learning faster than they are. If chatbots can, it's worth remembering that the real measure of their intelligence isn't found in their ability to fool us. In the short term, it's found in the value they provide. And in the long term, it's found in the questions their success inspires us to ask about the nature of our creativity, intelligence, and consciousness.
FINAL NOTE
FEEDBACK
How would you rate today's email?
❤️ TAIP Review of The Week
"Can you explain Alan Turing's contributions to AI? I'm a big fan of the newsletter!"
REFER & EARN
Your Friends Learn, You Earn!
{{rp_personalized_text}}
Refer 3 friends to learn how to 👷‍♂️Build Custom Versions of OpenAI's ChatGPT.
Copy and paste this link to friends: {{rp_refer_url}}
