šŸ§  LLMs Redefine How Humans Interact With Computers

PLUS: Why Certain AI Models Are More Like Humans Than Software

Welcome back, AI prodigies!

In todayā€™s Sunday Special:

  • šŸ“œThe Prelude

  • šŸ’¬LLMs Arenā€™t Exactly Software

  • šŸ¤”But Theyā€™re Kind of Like People

  • šŸ”‘Key Takeaway

Read Time: 7 minutes

šŸŽ“Key Terms

  • Anthropomorphism: Using human traits, emotions, or intentions to describe non-human things.

  • Hallucinations: When LLMs present false information as fact, often in a confident or matter-of-fact tone.

  • Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.

  • LLM-Modulo: A framework that combines LLMs with external verifiers to check the accuracy of an LLMā€™s responses to user queries.

  • Chain-of-Thought (CoT): A technique that encourages LLMs to explain their reasoning by breaking down complex tasks into manageable steps.

  • System Prompts: A set of instructions, guidelines, and contextual information provided to AI models before they engage with user queries.

šŸ©ŗPULSE CHECK

When interacting with AI, should we treat it like a human?

Vote Below to View Live Results

šŸ“œTHE PRELUDE

Weā€™ve all spoken to voice assistants. Whether you chatted with Appleā€™s Siri, Samsungā€™s Bixby, or Amazonā€™s Alexa, your mood probably determined your tone: polite and gentle at your best, cursing at the device at your worst. Though asking Amazonā€™s Alexa to turn on the lights seems trivial, it raises questions about how we should treat AI. To explore these questions, weā€™ll narrow our focus to LLMs, as theyā€™re the most widely used AI application.

Whether AI will ever be equivalent to human consciousness is a question that involves not just the technical capabilities of AI but also our perceptions of what it means to be human and conscious. After all, we easily attribute human features to things that donā€™t resemble humans at all. For example, we often personify our pets by naming them, attributing emotions to them, and talking to them as if they understand. Iā€™m still waiting for someone to invent a dog translator!šŸ¤£

Humans are prone to Anthropomorphism, especially with LLMs, since going back and forth with one feels like talking to someone. Yet some observers, like cognitive scientist Gary Marcus of New York University (NYU), warn against attributing human-like characteristics to AI applications. According to Marcus, doing so risks overestimating AIā€™s intellect, sentience, and capacity for companionship. We believe the anthropomorphization of LLMs is inevitable, so we must understand what weā€™re anthropomorphizing to mitigate the risks and foresee the implications.

šŸ’¬LLMs ARENā€™T EXACTLY SOFTWARE

Because AI, as a technical term, is intimidating, many people assume itā€™s a tool made by programmers for programmers. As a result, Information Technology (IT) departments often lead corporate AI strategies, and people look to computer scientists to forecast AIā€™s implications. Though programmers use LLMs to debug or autocomplete their code, the usefulness of LLMs isnā€™t bound by the tasks of their creators. In other words, LLMs can be used for tasks their creators never intended. The number of LLM use cases is limited only by the number of tasks involving human language.

And although LLMs are made of software, they donā€™t function like most software applications. Theyā€™re probabilistic and, therefore, unpredictable. Unlike the ā€œCheck Outā€ button on a retail website that always takes you to the payment page, LLMs often produce different outputs (i.e., answers) given the same input (i.e., question). Though LLMs canā€™t quite think, their language simulations out-invent most humans: in seconds, they can produce combinations of content (sentences, pictures, sounds, or videos) that never existed before.
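
To make that concrete, hereā€™s a minimal sketch (using OpenAIā€™s Python SDK, with an illustrative model name and prompt) of asking an LLM the exact same question twice, which will usually yield two different answers:

```python
# A minimal sketch of LLM non-determinism, using OpenAI's Python SDK.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

prompt = "Name a surprising use for a paperclip, in one sentence."

# Ask the exact same question twice. With a nonzero sampling
# temperature, the model picks tokens probabilistically, so the two
# answers will usually differ, unlike a "Check Out" button.
for attempt in range(2):
    response = client.chat.completions.create(
        model="gpt-4o",   # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # nonzero temperature = sampled, varied outputs
    )
    print(f"Attempt {attempt + 1}: {response.choices[0].message.content}")
```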

In fact, in some studies, their responses were rated as more empathetic and accurate than those of human doctors. In other studies, they surpassed the average human IQ level on the Norway Mensa IQ Test: an online exam that requires you to solve 35 visual pattern puzzles within 25 minutes. The puzzles get progressively more complex, and you earn points for each correct answer. OpenAIā€™s o1-preview correctly solved 25 of the 35 puzzles on a version of the test that contained new, unpublished problems. For context, an IQ of 120 is considered above average and places you in the top 10% of the human population.

Yet, they also have severe limitations, like an inability to generalize their knowledge to new, unseen tasks. In narrow, structured assessments, LLMs are quick, high-volume brainstormers. But an LLMā€™s ā€œreasoningā€ only reflects whatā€™s already been done, digitized, and documented. The best example is basic arithmetic. Even after being fine-tuned on a vast dataset of three-digit multiplication, LLMs failed to solve five-digit multiplication. This suggests that while LLMs can perform well on familiar tasks, they may lack a true understanding of the underlying principles needed to apply them to novel situations. You might say that current LLMs like OpenAIā€™s GPT-4o (ā€œoā€ for ā€œomniā€) can multiply five-digit numbers correctly, and youā€™d be right. However, their underlying mechanism relies on external tools like calculators or pre-programmed algorithms within an LLM-Modulo framework, where additional computational resources augment the LLMā€™s capabilities.
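
Hereā€™s a toy sketch of that LLM-Modulo pattern. The `ask_llm` function is a hypothetical stand-in for a real model call (it returns a hard-coded wrong answer to show the flow), and the external verifier is ordinary exact arithmetic:

```python
# A toy sketch of the LLM-Modulo idea: pair a fallible LLM with an
# external verifier that can check its answer exactly.
def ask_llm(question: str) -> str:
    # Hypothetical stand-in for a real chat-completion call; returns a
    # plausible-looking but wrong answer, as LLMs sometimes do.
    return "4,361,429,600"

def multiply_with_verifier(a: int, b: int) -> int:
    claimed = ask_llm(f"What is {a} * {b}? Reply with digits only.")
    try:
        value = int(claimed.strip().replace(",", ""))
    except ValueError:
        value = None
    truth = a * b          # the verifier: exact arithmetic, no guessing
    if value == truth:
        return value       # the LLM's answer passed verification
    return truth           # otherwise, fall back to the trusted tool

print(multiply_with_verifier(48271, 90353))  # 4361429663
```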

LLMs are inconsistent language generators that canā€™t reason, yet they help humans with a wide variety of tasks. Weā€™re not working with just another piece of software. At the same time, weā€™re clearly not texting back and forth with a human. So, whatā€™s the deal?

šŸ¤”BUT THEYā€™RE KIND OF LIKE PEOPLE

Though LLMs arenā€™t human, they excel at human-centric tasks like writing and empathizing while struggling with traditionally machine-friendly tasks like repeating a process consistently or performing complex mathematical calculations. When they do solve machine-friendly problems, they solve them in a very human way. If you ask OpenAIā€™s ChatGPT to analyze a spreadsheet, it doesnā€™t innately understand the numbers. Instead, it leverages tools like we do: glancing at the spreadsheet, then writing Python (i.e., a programming language) to perform the analysis. Even its flaws, such as occasional laziness, making up information, and false confidence in wrong answers, resemble human errors more than machine errors.
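
The Python an LLM writes for a request like that tends to look something like the sketch below (the file name and column names are hypothetical):

```python
# The kind of code ChatGPT typically generates for spreadsheet analysis:
# load the file, then let pandas do the actual number-crunching.
import pandas as pd

df = pd.read_csv("sales.csv")  # hypothetical spreadsheet

# Summarize revenue by region, exactly as a human analyst might.
summary = df.groupby("region")["revenue"].agg(["count", "mean", "sum"])
print(summary.sort_values("sum", ascending=False))
```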

This quasi-human quality makes LLMs receptive to prompting techniques like telling the AI who itā€™ll become or asking it to provide step-by-step reasoning. Defining who the AI is and what its specific objectives are contextualizes the conversation: telling it to ā€œact as a strategic, patient tutorā€ will create a better learning experience. Additionally, Chain-of-Thought (CoT) prompting, where you ask the AI to ā€œthink step-by-step,ā€ not only produces better-quality answers but also lets us see how the AIā€™s ā€œthinkingā€ progressed toward an answer.
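
Hereā€™s a minimal sketch of both techniques in a single request, using OpenAIā€™s Python SDK (the model name and wording are just examples):

```python
# Role prompting + Chain-of-Thought in one call, via OpenAI's Python SDK.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        # 1) Role prompting: tell the model who it'll "become".
        {"role": "system",
         "content": "Act as a strategic, patient tutor for a beginner."},
        # 2) Chain-of-Thought: ask it to reason step by step.
        {"role": "user",
         "content": "Why do larger sample sizes reduce sampling error? "
                    "Think step-by-step before giving your final answer."},
    ],
)
print(response.choices[0].message.content)
```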

When developers integrate AI applications into consumer products, consumers expect them to behave like software, meaning they should do precisely whatā€™s expected every time. By that standard, an AI application that performs a task correctly 90% of the time is unreliable. Yet 100% accuracy is almost impossible to achieve with statistical-learning-based AI applications like LLMs.
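
A quick back-of-the-envelope calculation shows why. If a product chains several LLM calls together (an illustrative assumption, treating each step as independent), per-step accuracy of 90% compounds into much worse end-to-end reliability:

```python
# If each step succeeds 90% of the time and a workflow chains n
# independent steps, the whole run succeeds with probability 0.9**n.
per_step = 0.90
for n in (1, 5, 10):
    print(f"{n:>2} chained steps -> {per_step ** n:.0%} end-to-end success")
# Output:
#  1 chained steps -> 90% end-to-end success
#  5 chained steps -> 59% end-to-end success
# 10 chained steps -> 35% end-to-end success
```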

With this in mind, we might become more comfortable with Hallucinations if we give LLMs human-like personalities. As end-users, we arenā€™t used to software making errors, but we expect errors from our human peers. Giving AI a human-like personality could also help us differentiate between mass-market, generalist LLMs with similar raw capabilities. For example, many people gravitate towards Anthropic Claudeā€™s emotion-filled answers. In Claudeā€™s case, this ā€œpersonalityā€ is intentional. In a post on X, Anthropicā€™s Lead Ethicist, Amanda Askell, revealed Claude 3ā€™s System Prompts. Hereā€™s an excerpt from the instructions Claude 3 receives before engaging with user queries:

ā€œClaude should respond concisely to very simple questions but provide more thorough reasoning to more complex, open-ended questions. If asked about controversial topics, Claude should try to offer careful thoughts and objective information without downplaying its harmful content...Claude doesnā€™t engage in stereotyping, including the negative stereotyping of majority groups.ā€

-X/@AmandaAskell/ā€œHere is Claude 3ā€™s system prompt!ā€

These instructions predispose Claude 3 to certain kinds of text generation. Whether developers should impose human-like personalities on conversational chatbots is no longer hypothetical; itā€™s a practical question we must address.
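
That predisposition is wired in at the API level. Hereā€™s a minimal sketch of how a system prompt is passed via Anthropicā€™s Python SDK (the model name and wording are illustrative, loosely paraphrasing the excerpt above):

```python
# Passing a System Prompt through Anthropic's Python SDK. The `system`
# field is prepended to every conversation, shaping all responses.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

message = client.messages.create(
    model="claude-3-opus-20240229",  # illustrative model name
    max_tokens=512,
    system=("Respond concisely to very simple questions, but provide "
            "thorough reasoning for complex, open-ended ones."),
    messages=[{"role": "user", "content": "Is anthropomorphizing AI risky?"}],
)
print(message.content[0].text)
```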

šŸ”‘KEY TAKEAWAY

Anthropomorphizing AI is no longer a theoretical discussion. Not only are developers telling conversational chatbots how to act, but chatbots now also have longer ā€œmemoriesā€ across multiple conversations and new features like voice mode.

Character.AI, which markets ā€œsuperintelligent chatbots that hear you, understand you, and remember you,ā€ is the second-most-used AI site after OpenAIā€™s ChatGPT. If human-AI interaction is closer to human-human interaction than to human-software interaction, it will birth a new set of written and unwritten social practices. Because these practices will develop through billions of human-AI interactions across thousands of tools, billions of users, and hundreds of cultures, no single interaction will feel consequential. But each one will bring us a step closer to a new shared social reality.

šŸ“’FINAL NOTE

FEEDBACK

How would you rate todayā€™s email?

It helps us improve the content for you!

ā¤ļøTAIP Review of The Week

ā€œCan you guys address if LLMs = Humans? Huge fan!ā€

-Kyan (1ļøāƒ£ šŸ‘Nailed it!)