🧠 Can AI Really Think? The Truth About LLMs.

PLUS: What’s Abductive Reasoning, and Why Can’t AI Do It?

Welcome back AI prodigies!

In today’s Sunday Special:

  • 📜The Prelude

  • 💭What Is Reasoning?

  • 💬Can LLMs Reason?

  • 🤖Can LRMs Reason?

  • 🔑Key Takeaway

Read Time: 7 minutes

🎓Key Terms

  • Large Language Models (LLMs): AI Models pre-trained on vast amounts of data to generate human-like text.

  • Large Reasoning Models (LRMs): AI Models designed to mimic a human’s decision-making abilities to solve complex, multi-step problems.

🩺 PULSE CHECK

Can AI reason at all?


📜THE PRELUDE

Consider this simple logic puzzle: “Jared has two brothers and two sisters. How many siblings does his sister Jenny have?”

If you said “four,” you’re right! Most of us solve this type of question instantly without a second thought. But conversational chatbots struggle with logic puzzles like this one.

What’s causing them to struggle? The problem lies in their limited ability to reason. While conversational chatbots are great at generating human-like text, they don’t truly understand the logic behind what they’re generating.

So, what exactly is reasoning? How do LLMs work? How well do they reason? And can LRMs do any better?

💭WHAT IS REASONING?

Philosophers divide Reasoning into three categories: Deductive, Inductive, and Abductive.

During the 4th century BC, ancient Greek philosopher Aristotle conceived of Deductive Reasoning and Inductive Reasoning in the Organon, a collection of six works on logical analysis.

During the late 19th century, American mathematician Charles Sanders Peirce defined a new logical process known as Abductive Reasoning.

Here’s what makes each type of Reasoning distinct:

  1. Deductive Reasoning: The process of deriving specific conclusions from general premises. If all the general premises are true, then the specific conclusion must also be true. For example, “All mammals are warm-blooded; all whales are mammals; therefore, all whales are warm-blooded.”

  2. Inductive Reasoning: The process of forming probable conclusions based on repeated observations. For example, the sun rising every day is something we’ve always observed, so we expect it to rise again tomorrow. But technically, we can’t be 100% sure because it’s based on repeated observations, not absolute proof.

  3. Abductive Reasoning: The process of starting with an observation and seeking the most plausible explanation. For example, if you notice your lawn is wet, you might conclude that it rained last night. In other words, Abductive Reasoning pinpoints the most likely causes of what you observe.

We often combine different forms of Reasoning to solve everyday problems. For example, scientists use Abductive Reasoning to generate hypotheses that explain observations. Then, they employ Deductive Reasoning to derive testable experiments from those hypotheses. Next, they rely on Inductive Reasoning to generalize results from those testable experiments into broader theories.
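If you like seeing ideas in code, here’s a toy Python sketch of the three modes using the same examples from above. Everything in it (function names, probability numbers) is invented purely for illustration:

```python
def deduce_warm_blooded(is_mammal: bool):
    """Deduction: apply a general rule to a specific case; if the premises hold, the conclusion is guaranteed."""
    if is_mammal:        # Premise: all mammals are warm-blooded.
        return True      # A whale is a mammal, so it must be warm-blooded.
    return None          # The rule says nothing about non-mammals.

def induce_sunrise(observed_sunrises):
    """Induction: generalize from repeated observations; the conclusion is probable, not proven."""
    return len(observed_sunrises) > 0 and all(observed_sunrises)

def abduce_wet_lawn(explanations):
    """Abduction: pick the most plausible explanation for an observation."""
    return max(explanations, key=explanations.get)

print(deduce_warm_blooded(is_mammal=True))          # True
print(induce_sunrise([True] * 10_000))              # True (we *expect* tomorrow's sunrise)
print(abduce_wet_lawn({"it rained last night": 0.7,
                       "someone spilled water": 0.1,
                       "a sprinkler ran overnight": 0.2}))  # "it rained last night"
```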

So, where do LLMs fail in the landscape of Reasoning?

💬CAN LLMs REASON?

⦿ 1️⃣ 🦾How Do LLMs Work?

An LLM is essentially a sophisticated autocomplete machine trained on vast swaths of the Internet.

To train an LLM, developers essentially show it millions of sentences with the last word covered up (e.g., “The fat cat sat on the {BLANK}.”) and have it guess what comes next.

Each time the LLM guesses wrong, it adjusts its Weights: numerical values (billions of them in modern models) that help it decide which words or patterns are most important for making better guesses in the future.
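Here’s a drastically simplified sketch of that “guess, then adjust the weights” loop. Real LLMs update billions of weights via gradient descent over huge text corpora; this toy version just nudges a handful of hand-made scores on one example:

```python
from collections import defaultdict

# weights[(context_word, candidate_word)] -> how strongly that pairing suggests the candidate
weights = defaultdict(float)
vocab = ["hat", "dog", "mat"]   # tiny made-up vocabulary

def guess_next(context_words):
    # Score every candidate word by summing its weights against the context.
    scores = {w: sum(weights[(c, w)] for c in context_words) for w in vocab}
    return max(scores, key=scores.get)

def train_step(context_words, correct_word):
    predicted = guess_next(context_words)
    if predicted != correct_word:              # Guessed wrong...
        for c in context_words:                # ...so adjust the relevant weights:
            weights[(c, correct_word)] += 1.0  # strengthen the right pairing,
            weights[(c, predicted)] -= 1.0     # weaken the wrong one.

context = ["the", "fat", "cat", "sat", "on", "the"]   # "The fat cat sat on the ____."
for _ in range(3):
    train_step(context, "mat")

print(guess_next(context))   # "mat"
```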

In simple terms, Weights control how tens of thousands of words relate to each other within an LLM. These relationships help form the Neural Network (NN): a highly interdependent framework that processes all the words using two methods:

  1. Attention Mechanisms calculate how much each word in a sentence should “pay attention” to every other word. Consider the following sentence: “Miami, coined the ‘Magic City,’ has beautiful white-sand beaches.” In this case, the words “Miami” and “beaches” would pay more attention to each other because they’re closely related (a bare-bones sketch of this mechanism follows this list).

  2. Transformer Layers help further clarify the meaning of each word within a sentence. This process helps the LLM develop a deeper understanding of the context. Consider the following sentence: “The cat chased the mouse.” In this case, it looks at the word “chased” and determines that “cat” is important because it’s doing the chasing. It also determines that “mouse” is important because it’s being chased. So, it understands that “chased” is connected to “cat” and “mouse.”
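Here’s a bare-bones sketch of the scaled dot-product attention used inside Transformer Layers. In real models, the query/key/value vectors are learned projections of word embeddings; here they’re random numbers, so the resulting weights are meaningless, but the mechanics (scores, softmax, weighted blend) are the same:

```python
import numpy as np

tokens = ["Miami", "coined", "the", "Magic", "City", "has", "beautiful", "beaches"]
d = 4                                    # tiny embedding size, just for the demo
rng = np.random.default_rng(0)

Q = rng.normal(size=(len(tokens), d))    # queries: what each word is "looking for"
K = rng.normal(size=(len(tokens), d))    # keys:    what each word "offers"
V = rng.normal(size=(len(tokens), d))    # values:  the information each word carries

scores = Q @ K.T / np.sqrt(d)                                      # how relevant is word j to word i?
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax: each row sums to 1
context_aware = attn @ V                                           # blend every word's value by its attention weight

print(attn.shape)          # (8, 8): one attention weight for every pair of words
print(attn[0].round(2))    # how much "Miami" attends to each of the 8 words
```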

⦿ 2️⃣ 🧠 Reasoning Capabilities?

While LLMs excel at generating human-like text, their ability to Reason is fundamentally different from ours.

Here’s how they perform across the three distinct types of Reasoning:

  1. āŒ Deductive Reasoning {Simulated}: When high-quality training datasets contain explicit logical structures (e.g., if P→Q and Q→Z, then P→Z), LLMs can appear to perform Deductive Reasoning. But this is Mimicry, not a genuine logical deduction.

  2. āœ… Inductive Reasoning {Primary Mode}: LLMs are inherently incredible at Inductive Reasoning because they’re designed to recognize patterns. For example, when processing ā€œThe cat sat on the {BLANK},ā€ it knows to focus heavily on ā€œcatā€ and ā€œsatā€ to predict ā€œmatā€ rather than ā€œfatā€ because it draws on patterns it’s seen from similar phrases to identify likely word pairings.

  3. āŒ Abductive Reasoning {Severely Limited}: LLMs struggle with Abductive Reasoning because they lack a true understanding of how the world works beyond patterns of words. Imagine an LLM walks into a room and sees a window open, a puddle of water on the floor, and a wet cat. The LLM might say: ā€œMaybe someone spilled water on the floor then gave the cat a bath.ā€ This explanation is grammatically correct and logically sound, but the LLM overlooks the most plausible explanation because it lacks a true understanding of how cats behave and how that behavior triggers cause-and-effect outcomes.

⦿ 3️⃣ 🧩 Failure in Deductive Reasoning?

Let’s revisit the simple logic puzzle: “Jared has two brothers and two sisters. How many siblings does his sister Jenny have?”

When OpenAI’s GPT-4o (“o” for “omni”) is asked to solve this logic puzzle, it falls short.

Since Jared is one of the brothers, the other brother and the two sisters are his siblings. So, Jared has 4 siblings.

Now, for Jenny, who is one of the sisters.

That means her total siblings are:

👉Jared

👉The Other Brother

👉The Other Sister

That makes 3 siblings for Jenny.

✅ Final Answer: Jenny has 3 siblings.

⚙️Output Source: OpenAI’s GPT-4o (“o” for “omni”)/Sibling Logic Puzzle/Output #1!

GPT-4o (“o” for “omni”) mistakenly included Jared as one of the two brothers. We intuitively know that Jared’s two brothers exclude him.
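A few lines of Python make the arithmetic explicit (the variable names are ours, chosen just to spell out the counting):

```python
brothers_of_jared = 2           # "two brothers" = Jared's brothers, NOT counting Jared himself
sisters_of_jared = 2            # "two sisters" = Jenny and one other girl

boys = 1 + brothers_of_jared    # Jared plus his two brothers = 3 boys
girls = sisters_of_jared        # 2 girls

children = boys + girls         # 5 children in the family
jennys_siblings = children - 1  # everyone except Jenny herself

print(jennys_siblings)          # 4
```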

Fortunately, developers recently introduced LRMs to overcome this limitation, and they now power many of today’s most advanced AI models.

🤖CAN LRMs REASON?

LRMs are built to plan and reason, not just generate human-like text. They achieve this by utilizing Test-Time Compute (TTC), which allocates more computing power during AI Inference: everything that happens after you enter your prompt.

TTC leverages CoT and RL:

  1. Chain-of-Thought (CoT) to break down complex problems into manageable sub-problems, solve each sub-problem, and combine the partial answers into a complete solution.

  2. Reinforcement Learning (RL) to mimic the “trial-and-error” process humans use to learn, where decisions that lead to desired outcomes are reinforced.

Together, CoT and RL enable LRMs to mimic certain components of Abductive Reasoning by deconstructing complex problems into manageable sub-problems and reinforcing multiple potential solutions that are logically sound.

This process is what allows OpenAI’s o4-mini, Anthropic’s Claude Opus 4, and Google’s Gemini 2.5 Pro to mimic a human’s decision-making capabilities and problem-solving capacity.
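To give a feel for what “spending more Test-Time Compute” can look like, here’s a hedged sketch of one simple recipe: sample several independent chain-of-thought attempts and keep the answer they agree on (often called self-consistency). This is not necessarily how o4-mini, Claude Opus 4, or Gemini 2.5 Pro work internally, and `ask_model` is a made-up placeholder rather than a real API:

```python
from collections import Counter
import random

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for one chain-of-thought attempt by an LRM.
    In reality this would be a single model call that reasons step by step;
    here we just simulate mostly-correct final answers."""
    return random.choice(["4", "4", "4", "4", "3"])

def answer_with_more_compute(prompt: str, n_attempts: int = 15) -> str:
    # More attempts = more compute spent after the prompt is entered (inference time).
    attempts = [ask_model(prompt) for _ in range(n_attempts)]
    return Counter(attempts).most_common(1)[0][0]   # majority vote across attempts

puzzle = "Jared has two brothers and two sisters. How many siblings does his sister Jenny have?"
print(answer_with_more_compute(puzzle))   # almost always "4"
```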

When you ask OpenAI o4-mini to solve the same simple logic puzzle, it passes with flying colors.

Jenny has four siblings. In addition to herself, the family consists of:

👉Jared

👉Two Brothers

👉One Other Sister

So, Jenny’s brothers {2}, her sister {1}, and Jared {1} make {4} siblings in total.

āš™ļøOutput Source: OpenAI’s OpenAI o4-mini/Sibling Logic Puzzle/Output #1!

🔑KEY TAKEAWAY

LLMs are great at recognizing patterns, but they often fall short of understanding the logic behind those patterns. LRMs, which leverage TTC to deploy CoT and RL, allow advanced AI models to reason more like humans.

This matters because it brings us closer to confidently using advanced AI models in critical fields like law, finance, or medicine, where hallucinations can have serious consequences for people’s fundamental rights, safety, or health.

📒FINAL NOTE

FEEDBACK

How would you rate today’s email?

It helps us improve the content for you!


ā¤ļøTAIP Review of The Week

ā€œI understood all of it on the first read!!ā€

-Anna (1ļøāƒ£ šŸ‘Nailed it!)
REFER & EARN

🎉Your Friends Learn, You Earn!

You currently have 0 referrals, only 1 away from receiving 🎓3 Simple Steps to Turn ChatGPT Into an Instant Expert.