The AI Pulse

🧠 Does AI Grasp Meaning or Simply Replicate It?

PLUS: Do Hallucinations Actually Matter?

Welcome back, AI prodigies!

In today's Sunday Special:

  • 📜The Prelude

  • 🤖Multimodal AI Models, Explained.

  • 💭How Much Do Today's Chatbots Hallucinate?

  • ⚙️Do Hallucinations Actually Matter?

  • 🔑Key Takeaway

Read Time: 7 minutes

🎓Key Terms

  • Transformer Architecture: Allows AI models to weigh relationships between different parts of an input simultaneously, rather than processing it strictly in sequence.

  • Multimodal AI Models: AI models that can process, analyze, and generate various types of data.

  • Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.

  • Generative AI (GenAI): Uses AI models trained on text, images, audio, video, or code data to generate new content.

  • Convolutional Neural Networks (CNNs): A network of specialized layers that detect visual patterns in images, such as edges, textures, or structures.
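To make the CNN entry above concrete, here's a minimal sketch in NumPy of a single convolution pass with a hand-written vertical-edge kernel. This is illustrative only: real CNNs learn their kernels from data, and the toy image is an assumption for demonstration.

```python
import numpy as np

# A hand-picked vertical-edge kernel; real CNNs *learn* kernels like this.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

def convolve2d(img, k):
    """Slide the kernel over the image and sum elementwise products."""
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

# Toy image: dark left half, bright right half.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

edges = convolve2d(img, kernel)
# The response peaks exactly where the dark/bright boundary sits.
print(edges[0])  # [0. 4. 4. 0.]
```

The filter responds strongly only at the column where brightness jumps, which is exactly the "edge detection" behavior described above.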

🩺 PULSE CHECK

Will AI ever be capable of understanding things like we do?


📜THE PRELUDE

As GenAI continues to make remarkable strides, a fundamental question arises: does GenAI genuinely understand the content it processes, or does it merely simulate understanding?

American philosopher John Searle took up this question in his 1980 paper Minds, Brains, and Programs. He argued that computers lack genuine understanding by way of a thought experiment called The Chinese Room Argument:

Imagine you're inside a room with a door. Under the door, you're fed slips of paper covered in mysterious symbols. You don't know what the symbols mean, but a manual in the middle of the room provides instructions for manipulating them. So, you follow the manual, manipulate the symbols, and feed the slips back under the door. Unbeknownst to you, the symbols convey questions in Chinese, and the slips you passed back provide coherent answers to those questions.

Searle argues that even though the person inside the room produces coherent answers to questions posed in Chinese, it doesn't follow that they actually understand Chinese. They're just manipulating symbols according to a manual.

He concluded that computers, like the person inside the room, lack genuine understanding because they only process Syntax (i.e., the rules for constructing grammatically correct sentences) without Semantics (i.e., understanding what those sentences mean).

Now, 44 years after Searle's thought experiment, conversational chatbots like OpenAI's ChatGPT are becoming so human-like that we're asking ourselves: Does GenAI truly demonstrate Semantics, or is it merely the result of exceptional pattern recognition?

🤖MULTIMODAL AI MODELS, EXPLAINED.

Multimodal AI Models, like Anthropic's Claude 3.5 Sonnet, handle multiple modalities of data (e.g., text, images, audio, video, or code) to generate responses to user queries (i.e., prompts). To achieve this, they represent each modality as a sequence of numbers. When captioning an image, for instance, a Multimodal AI Model proceeds in three steps:

  1. It uses CNNs to extract features from the image, turning them into Vectors (i.e., ordered lists of numbers) that represent the image in a form the model can process.

  2. It relies on a Transformer Architecture to process the text, converting words into Vectors as well.

  3. It deploys Cross-Modal Attention, which aligns the image Vectors with the text Vectors, enabling the model to "understand" how images and captions relate. This is how it learns that certain visual features (e.g., a furry four-legged shape) are associated with specific words (e.g., "dog").
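Cross-Modal Attention can be sketched in a few lines of NumPy. This is a toy illustration under simplifying assumptions: tiny random vectors stand in for learned CNN and Transformer features, and the function names are illustrative, not any model's actual API.

```python
import numpy as np

def softmax(x, axis=-1):
    """Turn raw similarity scores into weights that sum to 1."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text_vecs, image_vecs):
    """Each text Vector 'attends' over the image Vectors: similarity
    scores decide how much of each image feature is blended into the
    text representation, aligning the two modalities."""
    scores = text_vecs @ image_vecs.T        # (n_text, n_image) similarities
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ image_vecs              # image-informed text Vectors

# 3 caption-word Vectors attending over 4 image-region Vectors (toy sizes).
rng = np.random.default_rng(0)
text = rng.normal(size=(3, 8))
image = rng.normal(size=(4, 8))

out = cross_modal_attention(text, image)
print(out.shape)  # (3, 8): one image-aware Vector per caption word
```

The output has one row per caption word, each now a weighted blend of image features, which is the "alignment" between modalities described above.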

Although Multimodal AI Models can generate content, at times more reliably than actual humans, they technically can't understand the content. Understanding is the ability to perceive the intended meaning of words, and perception requires being aware or conscious of what words represent. As far as we can tell, conversational chatbots aren't conscious. Nevertheless, their performance has rapidly improved each month as AI firms curate more data, implement more training, and acquire more computing power. Despite mind-boggling improvements across key metrics, Multimodal AI Models still Hallucinate by presenting false information as fact, often in a confident or matter-of-fact tone. So, how do we measure this? Does it even matter?

💭HOW MUCH DO TODAY'S CHATBOTS HALLUCINATE?

Vectara, the AI Agent platform for enterprises, developed "FaithBench," a benchmark that evaluates the Hallucination tendencies of LLMs. It sorts Hallucinations into three categories:

  1. Questionable: They aren't clearly Hallucinations. In other words, there's room for debate.

  2. Benign: While technically incorrect, these Hallucinations are supported by common sense or logical reasoning.

  3. Unwanted: Clear Hallucinations that directly contradict the facts (i.e., Intrinsic) or completely create new information (i.e., Extrinsic).
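The three-way taxonomy above can be rendered as a simple labeling scheme. The sketch below is an illustrative Python rendering of those categories, not Vectara's actual "FaithBench" code or data format:

```python
from enum import Enum
from collections import Counter

class HallucinationLabel(Enum):
    QUESTIONABLE = "questionable"  # debatable; not clearly a Hallucination
    BENIGN = "benign"              # technically wrong, but defensible
    UNWANTED = "unwanted"          # contradicts facts or invents new ones

def tally(labels):
    """Summarize annotator verdicts for a batch of AI-generated summaries."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in HallucinationLabel}

# Hypothetical annotator verdicts for four summaries.
verdicts = [
    HallucinationLabel.UNWANTED,
    HallucinationLabel.BENIGN,
    HallucinationLabel.UNWANTED,
    HallucinationLabel.QUESTIONABLE,
]
print(tally(verdicts))
```

Tallying verdicts this way is what lets researchers see which kind of Hallucination dominates, and therefore which mitigation to prioritize.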

ā€œFaithBenchā€ allows developers and researchers to better understand how LLMs produce Hallucinations, which helps them develop more targeted strategies to mitigate them.

GPT-4o (ā€œoā€ for ā€œomniā€) and GPT-3.5 Turbo were each tasked with summarizing 66 passages about various topics. 11 human annotators judged the validity of 660 AI-generated summaries by leveraging ā€œFaithBench.ā€ Ultimately, just 36% of the summaries generated by these LLMs were objectively true, underscoring the significant challenge of Hallucinations in even the most advanced conversational chatbots.

āš™ļøDO HALLUCINATIONS ACTUALLY MATTER?

Let's assume that Hallucinations disappear tomorrow. GenAI still would not understand language in a human-like way. Our language is deeply contextual, subjective, and socially constructed, requiring socialization and common sense through lived experience. GenAI lacks this because it doesn't interact with the world in a way that builds an intuitive grasp of concepts. Language isn't just about words; it's about how those words are used in a specific situation. The literal meaning of a word can change dramatically depending on the context. GenAI can't resolve ambiguity in the same way humans do.

Since GenAI does not and cannot understand language in a human-like way, worrying about Hallucinations presumes an expectation of reliability that was never achievable in the first place. We didn't design conversational chatbots to regurgitate facts.

After all, GenAI refers to Generative AI, not Repetitive AI. Hallucinations are often creative extrapolations rather than strict errors. If LLMs were purely deterministic fact machines, they would lose the flexibility that makes them powerful AI-enabled tools for brainstorming, summarizing, and generating content. Humans don't expect anyone or anything to be 100% accurate all the time. GenAI should be held to the same standard.

Although LLMs are made of software, they don't function like most software. They're probabilistic and, therefore, unpredictable. Unlike the "Check Out" button on a retail website, LLMs can produce different responses (i.e., outputs) when given the same questions (i.e., inputs).
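That unpredictability largely comes from sampling the next token from a probability distribution rather than always picking the single most likely one. Here's a minimal NumPy sketch of temperature-based sampling; the logits are made-up toy values, and this simplified sampler is an assumption for illustration, not any vendor's actual decoding code.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Pick a token index at random, weighted by softmax(logits / T).
    Higher temperature flattens the distribution, so repeated calls
    with the *same* logits can return different tokens."""
    rng = rng if rng is not None else np.random.default_rng()
    z = logits / temperature
    probs = np.exp(z - z.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

# Toy scores over a 4-token vocabulary -- the same "input" every time.
logits = np.array([2.0, 1.5, 0.3, -1.0])

rng = np.random.default_rng(0)
samples = {sample_next_token(logits, rng=rng) for _ in range(50)}
print(samples)  # more than one distinct token, despite identical input
```

Identical inputs, different outputs: exactly the contrast with the deterministic "Check Out" button described above.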

🔑KEY TAKEAWAY

Multimodal AI Models are undeniably impressive. Yet, at their core, they remain sophisticated pattern predictors, not true thinkers. They can perform remarkably well on various tasks, often mimicking or exceeding human-like abilities. However, this performance doesn't necessarily imply understanding or genuine competence in the way that humans possess it. While reducing Hallucinations may improve reliability, it doesn't solve this issue. The goal of GenAI has never been absolute truth; it seeks to be useful.
