LLMs Redefine How Humans Interact With Computers
PLUS: Why Certain AI Models Are More Like Humans Than Software
Welcome back, AI prodigies!
In today's Sunday Special:
The Prelude
LLMs Aren't Exactly Software
But They're Kind of Like People
Key Takeaway
Read Time: 7 minutes
Key Terms
Anthropomorphism: Using human traits, emotions, or intentions to describe non-human things.
Hallucinations: When LLMs present false information as fact, often in a confident or matter-of-fact tone.
Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.
LLM-Modulo: A framework that combines LLMs with external verifiers to check the accuracy of an LLM's responses to user queries.
Chain-of-Thought (CoT): A technique that encourages LLMs to explain their reasoning by breaking down complex tasks into manageable steps.
System Prompts: A set of instructions, guidelines, and contextual information provided to AI models before they engage with user queries.
PULSE CHECK
When interacting with AI, should we treat it like a human? Vote Below to View Live Results.
THE PRELUDE
We've all spoken to voice assistants. Whether you chatted with Apple's Siri, Samsung's Bixby, or Amazon's Alexa, your mood probably determined your tone. You were polite and gentle at your best, and you might have cursed at these devices at your worst. Though asking Amazon's Alexa to turn on the lights seems trivial, it raises questions about how we should treat AI. To explore these questions, we'll narrow our focus to LLMs, as they're the most widely used AI application.
Whether AI will ever be equivalent to human consciousness depends not just on the technical capabilities of AI but also on our perceptions of what it means to be human and conscious. We can easily attribute human features to almost anything without it resembling a human. For example, we often personify our pets by naming them, attributing emotions to them, and talking to them as if they understand. I'm still waiting for someone to invent a dog translator!
Humans are prone to Anthropomorphism, especially with LLMs, since going back and forth with them feels like talking to someone. Yet some observers, like cognitive scientist Gary Marcus of New York University (NYU), warn against attributing human-like characteristics to AI applications. According to Marcus, we risk overestimating AI's intellect, sentience, and companionship. We believe the anthropomorphization of LLMs is inevitable. So, we must learn what we're anthropomorphizing to mitigate the risks and foresee the implications.
LLMs AREN'T EXACTLY SOFTWARE
Because AI, as a technical term, is intimidating, many people think it's a tool made by programmers for programmers. As a result, Information Technology (IT) departments often lead corporate AI strategies, and people look to computer scientists to forecast the implications of AI. Though programmers use LLMs to debug or autocomplete their code, the usefulness of LLMs isn't bound by the tasks of their creators. In other words, LLMs can be used for tasks their creators didn't intend. The number of LLM use cases is limited only by the number of tasks involving human language.
And although LLMs are made of software, they don't function like most software applications. They're probabilistic and, therefore, unpredictable. Unlike the "Check Out" button on a retail website that always takes you to the payment page, LLMs often produce different outputs (i.e., answers) given the same input (i.e., question). Though LLMs can't quite think, their language simulations out-invent most humans. In seconds, LLMs can produce combinations of one or more types of content, such as sentences, pictures, sounds, or videos, that never existed before.
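To make that concrete, here's a minimal sketch (assuming the OpenAI Python SDK and an API key in your environment; the model name is illustrative): the same question, asked twice with a non-zero temperature, will often come back worded differently because each word is sampled from a probability distribution.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = [{"role": "user", "content": "Explain why the sky is blue in one sentence."}]

for attempt in range(2):
    response = client.chat.completions.create(
        model="gpt-4o",      # illustrative model name
        messages=question,
        temperature=1.0,     # non-zero temperature keeps the sampling stochastic
    )
    print(f"Attempt {attempt + 1}: {response.choices[0].message.content}")
```

Run it a few times and the two answers will rarely match word for word, which is behavior a "Check Out" button never exhibits.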
In fact, some studies rated their responses as more empathetic and accurate than those of human doctors. In other studies, they surpassed the average human IQ level on the Norway Mensa IQ Test: an online exam that requires you to solve 35 visual pattern puzzles within 25 minutes. The visual pattern puzzles get progressively more complex, and you earn points for each correct answer. OpenAI's o1-preview correctly solved 25 out of 35 visual pattern puzzles on a version of the Norway Mensa IQ Test that contained new, unpublished problems. For context, an IQ of 120 is considered above average and is in the top 10% of the human population.
Yet, they also have severe limitations, like an inability to generalize their knowledge to new, unseen tasks. In narrow, structured assessments, LLMs are quick, high-volume brainstormers. But an LLM's "reasoning" only reflects what has already been done, digitized, and documented. The best example of this is basic arithmetic. Even after being fine-tuned on a vast dataset to solve three-digit multiplication, LLMs failed to solve five-digit multiplication. This suggests that while LLMs can perform well on familiar tasks, they may lack the ability to truly understand the underlying principles and apply them to novel situations. You might say that current LLMs like OpenAI's GPT-4o ("o" for "omni") can do that correctly, and you'd be right. They may appear capable of complex tasks like five-digit multiplication. However, their underlying mechanism relies on external tools like calculators or pre-programmed algorithms within an LLM-Modulo framework, where additional computational resources augment the LLM's capabilities.
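As a rough, hypothetical sketch of that idea (not the official LLM-Modulo implementation), an external verifier can check the model's arithmetic and fall back to an exact tool when the proposal is wrong. The `ask_llm` helper below is a stand-in for whichever chat-completion call you use, not a real library function.

```python
def ask_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM API; returns the model's text answer."""
    raise NotImplementedError

def verified_multiply(a: int, b: int, max_attempts: int = 3) -> int:
    expected = a * b  # external verifier: exact arithmetic, not pattern matching
    for _ in range(max_attempts):
        reply = ask_llm(f"What is {a} * {b}? Reply with only the number.")
        try:
            proposal = int(reply.strip().replace(",", ""))
        except ValueError:
            continue          # unparsable answer: ask again
        if proposal == expected:
            return proposal   # the verifier accepts the LLM's proposal
    return expected           # otherwise, fall back to the tool's exact result
```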
LLMs are inconsistent language generators that can't reason, yet they help humans with a wide variety of tasks. We're not working with another piece of software. At the same time, we're also clearly not texting back and forth with a human. So, what's the deal?
BUT THEY'RE KIND OF LIKE PEOPLE
Though LLMs aren't humans, they excel at human-centric tasks like writing and empathy while struggling with traditionally machine-friendly tasks like repeating a process consistently or performing complex mathematical calculations. Instead, they solve machine-friendly problems in a very human way. If you ask OpenAI's ChatGPT to perform data analysis on a spreadsheet, it doesn't innately understand the numbers. Instead, it leverages tools like we do, glancing at the spreadsheet and then writing Python (i.e., a programming language) to perform the analysis. Even its flaws, such as occasional laziness, making up information, and false confidence in wrong answers, resemble human errors more than machine errors.
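For a sense of what that looks like, here's the kind of Python an LLM typically writes behind the scenes for a spreadsheet question (the file name and column names are made up for illustration):

```python
import pandas as pd

df = pd.read_excel("sales.xlsx")              # load the spreadsheet into a table
print(df.describe())                          # summary statistics for numeric columns
print(df.groupby("region")["revenue"].sum())  # example aggregation: total revenue by region
```

The model doesn't "see" the numbers; it delegates the counting to code, much like a person reaching for a calculator.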
This quasi-human quality of LLMs makes them receptive to prompting techniques like telling the AI who it'll become or asking the AI to provide step-by-step instructions. Defining who the AI is and its specific objectives will contextualize the conversation. For example, telling it to "act as a strategic, patient tutor" will create a better learning experience. Additionally, Chain-of-Thought (CoT) prompting, where you ask the AI to "think step-by-step," not only results in better-quality answers but also lets us better understand how the AI's "thinking" progressed toward an answer.
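Here's a minimal sketch of both techniques in one request (again assuming the OpenAI Python SDK; the persona and wording are just examples):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        # Persona: tell the AI who it'll become before the conversation starts.
        {"role": "system", "content": "Act as a strategic, patient tutor."},
        # Chain-of-Thought: ask it to reason step-by-step instead of jumping to an answer.
        {"role": "user", "content": "I got 3/4 + 1/6 wrong on my homework. "
                                    "Think step-by-step and walk me through it."},
    ],
)
print(response.choices[0].message.content)
```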
When developers integrate AI applications into consumer products, consumers expect them to behave like software, meaning they should do precisely what users expect. By that standard, an AI application that performs a task correctly 90% of the time is unreliable. Yet 100% accuracy is almost impossible to achieve with statistical, learning-based AI applications like LLMs.
With this in mind, we might become more comfortable with Hallucinations if we give LLMs human-like personalities. As end-users, we aren't used to software making errors, but we expect errors from our human peers. Giving AI a human-like personality could also help us differentiate between mass-market, generalist LLMs with similar raw capabilities. For example, many people gravitate towards Anthropic Claude's emotion-filled answers. In Claude's case, this "personality" is intentional. In a post on X, Anthropic's Lead Ethicist, Amanda Askell, revealed Claude 3's System Prompts. Here's an excerpt from the instructions Claude 3 receives before it engages with user queries:
"Claude should respond concisely to very simple questions but provide more thorough reasoning to more complex, open-ended questions. If asked about controversial topics, Claude should try to offer careful thoughts and objective information without downplaying its harmful content... Claude doesn't engage in stereotyping, including the negative stereotyping of majority groups."
These instructions predispose Claude 3 to certain kinds of text generation. Whether developers should impose human-like personalities on conversational chatbots is a practical question we must now address.
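Product teams can set this kind of personality themselves on a per-request basis. Here's a minimal sketch assuming the Anthropic Python SDK (the system prompt below is our own illustrative example, not Anthropic's actual instructions):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-opus-20240229",  # illustrative model name
    max_tokens=300,
    # The system prompt predisposes the model toward a particular persona and tone.
    system="You are a warm, plain-spoken assistant. Answer concisely and admit uncertainty.",
    messages=[{"role": "user", "content": "Should I refinance my mortgage this year?"}],
)
print(message.content[0].text)
```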
KEY TAKEAWAY
Anthropomorphizing AI is no longer a theoretical discussion. Not only are developers telling conversational chatbots how to act, but conversational chatbots now have longer "memories" across multiple conversations and new features like voice mode.
Character.AI, which offers superintelligent chatbots that hear you, understand you, and remember you, is the second most used AI site after OpenAI's ChatGPT. If human-AI interaction is closer to human-human interaction than human-software interaction, it will birth a new set of written and unwritten social practices. Because these practices will develop through billions of human-AI interactions across thousands of tools, billions of users, and hundreds of cultures, no single interaction will feel consequential. But each interaction will bring us one step closer to a new shared social reality.
FINAL NOTE
FEEDBACK
How would you rate today's email? It helps us improve the content for you!
TAIP Review of The Week
"Can you guys address if LLMs = Humans? Huge fan!"
REFER & EARN
Your Friends Learn, You Earn!
You currently have 0 referrals, only 1 away from receiving the Ultimate Prompt Engineering Guide.
Refer 3 friends to learn how to Build Custom Versions of OpenAI's ChatGPT.
Copy and paste this link to friends: https://theaipulse.beehiiv.com/subscribe?ref=PLACEHOLDER