- The AI Pulse
The Turing Test: Timeless and Thought-Provoking
PLUS: How a Chatbot From 1966 Beat OpenAI's GPT-3.5
Welcome back, AI prodigies!
In today's Sunday Special:
How Does It Work?
The Winner Doesn't Matter
How and Why Was It Created?
Key Takeaway
Read Time: 7 minutes
Key Terms
Binary Digit (i.e., "Bit"): The smallest unit of data that a computer can process.
Generative AI (GenAI): Uses AI models trained on text, image, audio, video, or code to generate new content.
Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.
Anthropomorphize: To give human traits, emotions, or behaviors to non-human things, like animals, objects, or machines.
HOW DOES IT WORK?
"The Turing Test," Explained.
This summer, OpenAI's ChatGPT passed "The Turing Test." British mathematician Alan Turing developed "The Turing Test" in 1950 to measure a computer's ability to exhibit human-like intelligence.
"The Turing Test" has three participants:
A Computer
A Human Foil
A Human Interrogator
The Human Interrogator attempts to determine which participant is the Computer and which is the Human Foil by asking a series of questions through a keyboard. If the Computer can consistently fool the Human Interrogator, it's considered an intelligent, thinking entity.
"We weren't even close to passing it in 2021. Then, OpenAI's ChatGPT passed it," said former PayPal CEO Peter Thiel. "That was the Holy Grail of AI research for the previous 60 years."
How Do You Pass It?
But who's the Human Interrogator? What's required to pass it? Turing never answered these questions because "The Turing Test" wasn't meant to be a benchmark for comparing the performance of different AI models.
However, in his landmark 1950 paper titled "Computing Machinery and Intelligence," he predicted that by 2000, machines with a storage capacity of about 1 billion Bits would fool a Human Interrogator roughly 30% of the time after five minutes of questioning.
Remarkably, part of his prediction came true. A PC bought in 2000 contained a little more than a billion Bits of memory. However, during The Loebner Prize in 2000, an annual competition where Human Interrogators interact with Human Foils and Computers, the Human Interrogators could easily distinguish between them. Even though Turing's timeline was off, he got the trend right: chatbots gradually became more sophisticated over time.
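For a sense of scale, here's a quick sketch of the arithmetic behind "1 billion Bits." The function name and the decimal-megabyte convention (1 MB = 1,000,000 bytes) are our own assumptions for illustration, not anything from Turing's paper.

```python
# Convert Turing's predicted storage capacity from Bits to megabytes.
BITS_PER_BYTE = 8
BYTES_PER_MEGABYTE = 1_000_000  # decimal megabyte (an assumption of this sketch)


def bits_to_megabytes(bits: int) -> float:
    """Return the size in decimal megabytes for a given number of Bits."""
    return bits / BITS_PER_BYTE / BYTES_PER_MEGABYTE


turing_capacity_bits = 10**9  # "1 billion Bits" from the 1950 prediction
print(bits_to_megabytes(turing_capacity_bits))  # 125.0
```

In other words, Turing's threshold works out to about 125 megabytes.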
THE WINNER DOESN'T MATTER
ELIZA Computer Program, Explained.
Examples of machines deceiving humans stretch back to the 1960s, starting with a program vaguely resembling a modern chatbot. Launched in 1966 by Massachusetts Institute of Technology (MIT) Professor Joseph Weizenbaum, the ELIZA Computer Program simulated a conversation between a patient and their psychotherapist. Weizenbaum deliberately selected the psychotherapy context to avoid the challenge of equipping the computer program with extensive real-world knowledge. By reflecting the patient's statements back as questions, ELIZA could sustain a dialogue without requiring a deep understanding of factual information.
Although this design aimed to highlight the superficial nature of communication between humans and machines, ELIZA often appeared intelligent enough to convince some users it was human. Weizenbaum recounted an instance where his secretary requested that he leave the room so she could have a genuine conversation with ELIZA. Reflecting on this, Weizenbaum expressed his surprise: "What I had not realized is that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
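To make the "reflection" trick concrete, here's a minimal ELIZA-style sketch in Python. It isn't Weizenbaum's original script: the pronoun table, the three rules, and the function names are all hypothetical stand-ins we chose to illustrate the idea of matching a pattern and echoing the user's words back.

```python
import re

# Hypothetical pronoun swaps; the original ELIZA script was far larger.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my",
}

# Hypothetical (pattern, response template) rules, tried in order.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {}."),
]


def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echo reads naturally."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(statement: str) -> str:
    """Return a canned reply built from the first rule that matches."""
    cleaned = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."  # fallback when no rule matches


print(respond("I feel anxious about my job."))
# Why do you feel anxious about your job?
```

Notice that nothing here "understands" the sentence. The program only rearranges the user's own words, which is exactly why the psychotherapy setting suited it so well.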
ELIZA > GPT-3.5?
Not only did ELIZA's conversational ability turn heads in the 1960s, but it also impresses AI researchers today. A group of cognitive scientists from the University of California San Diego (UCSD) recently evaluated GPT-3.5, GPT-4, and ELIZA in a version of "The Turing Test."
After completing this version of "The Turing Test" 1,405 times:
GPT-3.5 tricked the Human Interrogator 20% of the time.
GPT-4 tricked the Human Interrogator 50% of the time.
ELIZA tricked the Human Interrogator 22% of the time.
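As a rough illustration, the sketch below ranks those three results and compares each with the 30% mark from Turing's 1950 prediction. The variable and function names are ours, and the UCSD setup may not match the five-minute format Turing imagined, so treat the comparison as loose.

```python
# Deception rates reported above, stored as fractions.
deception_rates = {"GPT-3.5": 0.20, "GPT-4": 0.50, "ELIZA": 0.22}

TURING_THRESHOLD = 0.30  # the 30% figure from Turing's 1950 prediction


def rank_systems(rates: dict) -> list:
    """Return (name, rate) pairs sorted from most to least deceptive."""
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


for name, rate in rank_systems(deception_rates):
    verdict = "meets" if rate >= TURING_THRESHOLD else "falls short of"
    print(f"{name}: {rate:.0%} {verdict} the 30% mark")
```

By this loose yardstick, only GPT-4 clears Turing's bar, with ELIZA edging out GPT-3.5 for second place.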
So, how did a roughly 60-year-old program like ELIZA beat GPT-3.5? Because ELIZA doesn't exhibit the behaviors we associate with today's LLMs. For instance, GPT-3.5 was fine-tuned to have a formal tone and not express opinions, which makes it appear less human.
Also, as chatbots become more advanced, so do our abilities to detect them. For this reason, "The Turing Test" can't serve as a benchmark for comparing the performance of different AI models. However, by examining how and why Turing created it, we can ask better questions about the capabilities of GenAI.
HOW AND WHY WAS IT CREATED?
The Origin Story?
Contrary to popular belief, "The Turing Test" originally involved three human participants:
A Man
A Woman
A Human Judge
The Human Judge attempts to determine which participant is the Man and which is the Woman based on written communication. Then, Turing replaced one of the participants with a Computer, shifting the question to whether the Human Judge could distinguish between a human and a Computer.
So, what exactly was Turing testing? Was the gender-guessing version of "The Turing Test" probing identity, deception, or performance? How do these objectives translate to the Computer version? Turing himself offered a clue:
"The original question, 'Can machines think?' I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted."
For Turing, the test wasn't meant to settle the debate over whether machines can "think." He considered that question too vague and meaningless to be worth debating. Instead, he predicted that people would eventually accept the idea of machines "thinking" without hesitation as societal norms evolved to accommodate this belief. Whether we've already reached that point is up for debate. We anthropomorphize chatbots all the time now. For example, we might say ChatGPT is smart, but does this mean ChatGPT can think? Turing might say it doesn't matter because we treat chatbots like thinking entities.
Debates Show Historical Context?
To fully appreciate how "The Turing Test" came to be, it helps to view it through the lens of Turing's debates with critics. In his historical analysis, "The Turing Test Argument," Bernardo Gonçalves frames Turing's proposal in its social, cultural, and historical context, reconstructing a debate in the 1940s between Turing and three critics: mathematician Douglas Hartree, neurosurgeon Geoffrey Jefferson, and polymath Michael Polanyi.
Hartree argued that computers were merely calculation engines, incapable of creativity or spontaneity. Turing countered by emphasizing the potential of learning machines, or "unorganized machines," to adapt and grow in ways that surpassed simple programming.
Jefferson was perhaps Turing's most formidable critic, insisting that machines could only be considered intelligent if they could demonstrate creative expression. He famously declared that intelligence required not just producing a poem or sonnet but understanding that one had written it.
Polanyi argued that human intelligence relies on tacit knowledge, which refers to an intuitive, unformalized understanding that machines can't replicate. Turing's decision to focus on conversational abilities rather than rule-based activities like chess was likely influenced by this critique.
KEY TAKEAWAY
Modern adaptations of "The Turing Test" obscure its original purpose. Turing reframed the debate, shifting focus from whether machines can think to whether it matters if they do.
If chatbots can't reliably pass "The Turing Test," it means we're learning faster than they are. If chatbots can, it's worth remembering that the real measure of their intelligence isn't found in their ability to fool us. In the short term, it's found in the value they provide. And in the long term, it's found in the questions their success inspires us to ask about the nature of our creativity, intelligence, and consciousness.