🧠 Hidden Intelligence: What AI Can't Capture

PLUS: How Carbon Credits Show Us the Limitations of Data-Based Decision Making

Welcome back, AI prodigies!

In today's Sunday Special:

  • ⚙️Data Is Dumb

  • 📊More Metrics, More Malpractice

  • 🥷Two Types of Covert Intelligence

  • 🔑Key Takeaway

Read Time: 6 minutes

🎓Key Terms

  • Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.

  • Machine Learning (ML): Leverages data to recognize patterns and make predictions without explicit instructions from developers.

  • Encoding Human Intuition: Translating a human's subjective, often subconscious, decision-making process into a format that machines can understand.

  • Reinforcement Learning From Human Feedback (RLHF): A training method that uses human feedback to fine-tune LLMs so their outputs better align with human preferences.

🩺 PULSE CHECK

When will we be able to encode human intuition in data?

Vote Below to View Live Results

⚙️DATA IS DUMB

"In four years, a child has seen 50 times more data than the biggest LLMs... Text is simply too low bandwidth and too scarce a modality to learn how the world works."

-Yann LeCun, Chief AI Scientist at Meta.

The world is a visually complex, multidimensional place. Although Meta's Llama 3.1, one of the largest openly available LLMs, was trained on roughly 11 trillion words, text data will never capture this complexity. Since data is the fuel of every LLM, every LLM-based AI application, from Writing Copilots to Answering Engines, will suffer the same limitations.

High-quality datasets can mitigate this issue. For example, many Big Tech companies seek to license Wikipedia to train their AI models because it's generally accurate, even though it doesn't adhere to rigorous academic standards. However, even high-quality datasets, such as collections of the most widely cited scientific publications, fail to capture covert forms of intelligence. How do these limitations arise? And what types of intelligence are missing?

📊MORE METRICS, MORE MALPRACTICE

Any metric, no matter how carefully constructed, has its limitations. Choosing which type of data to collect inherently means excluding other kinds. As British statistician George E. P. Box famously noted, "All models are wrong, but some are useful." Regardless of size, any AI model trained on a dataset will fail to capture some elements of the complex relationships it's meant to represent.

When an LLM generates an output, it makes a series of decisions. LLMs are trained to "predict" the next word by analyzing strings of text and producing a response in which each subsequent word is the one most likely to occur given the words before it. In simple terms, LLMs are text-generating machines that learn the word sequences present in their training data.
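
To make "predict the next word" concrete, here's a minimal sketch: a toy bigram model that counts which word follows which in a made-up corpus and always picks the most likely continuation. The corpus and code are purely illustrative assumptions, not how any production LLM is built, but the core move is the same: choose the statistically most likely next word given the words so far.

```python
# A toy "predict the next word" model: count which word follows which,
# then always pick the most likely continuation.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the most likely next word seen after `word` in the corpus."""
    candidates = next_word_counts[word]
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

# Generate a short continuation, one "most likely" word at a time.
word = "the"
generated = [word]
for _ in range(5):
    word = predict_next(word)
    generated.append(word)

print(" ".join(generated))  # e.g. "the cat sat on the cat"
```

Real LLMs swap the counting table for a neural network with billions of parameters, but they are still generating likely continuations, not verified facts.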

Making decisions based on data, whether by humans or LLMs, involves selecting from a range of possibilities based on available information. However, the critical difference between humans and LLMs lies in the nature of the data and the decision-making process. LLMs rely on statistical patterns in text, while humans draw on a much broader context: lived experience, social cues, and background knowledge. Oftentimes, a variable that seemed unrelated ends up being the one that predicts the outcome.

Humans?

In many industries, new metrics have created new incentives, resulting in unexpected behaviors. For example, the United Nations (UN) proposed Carbon Credits to fight climate change. Carbon Credits work like permission slips for emissions: if a company exceeds its government-mandated emission limits, it must buy Carbon Credits. The idea was to incentivize companies to install renewable energy infrastructure to offset the greenhouse gases they emit. Instead, many companies simply bought plots of forest that weren't at risk of being cut down and claimed a reduction in carbon emissions. A recent investigation into the world's leading Carbon Credits marketplace found that 90% of its credits were "phantom credits" that didn't represent real carbon reductions.

Machines?

In healthcare, this can be particularly problematic. Nearly five years ago, a bombshell study, now cited over 1,266 times, found racial bias in a clinical algorithm used to recommend care for millions of patients. The mathematical formula used healthcare costs as a proxy for illness severity, where higher costs correlate with more severe illnesses. As is common in Machine Learning (ML), the algorithm was trained on past healthcare spending data, but that data reflected steep income disparities between black patients and white patients. As a result, black patients had to be much sicker than white patients with the same symptoms before they were recommended for the same care.
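
To see how a proxy metric can quietly encode bias, here's a hedged, hypothetical simulation; the group labels, the 40% spending gap, and the sickness threshold are invented for illustration and are not taken from the study. Two groups are equally sick, but one historically spends less on care, so a cost-based "risk score" consistently rates its members as lower risk.

```python
# Hypothetical simulation of cost-as-proxy bias (invented numbers, not the
# study's data): equal illness, unequal spending, unequal risk scores.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# True illness severity is identical across both groups (scale 0-10).
illness_a = rng.uniform(0, 10, n)
illness_b = rng.uniform(0, 10, n)

# Assumption for illustration: Group B spends ~40% less for the same illness.
cost_a = 1_000 * illness_a + rng.normal(0, 500, n)
cost_b = 0.6 * 1_000 * illness_b + rng.normal(0, 500, n)

# The proxy model: predicted "risk" is just normalized spending,
# mirroring the idea that higher costs = more severe illness.
all_costs = np.concatenate([cost_a, cost_b])
risk_a = (cost_a - all_costs.mean()) / all_costs.std()
risk_b = (cost_b - all_costs.mean()) / all_costs.std()

# Compare risk scores for patients who are equally sick (severity >= 7).
sick_a = risk_a[illness_a >= 7].mean()
sick_b = risk_b[illness_b >= 7].mean()
print(f"Avg risk score, equally sick Group A patients: {sick_a:.2f}")
print(f"Avg risk score, equally sick Group B patients: {sick_b:.2f}")
# Group B scores lower, so fewer clear the care-referral threshold.
```

Note that the simulated model never sees group membership at all; the bias enters entirely through the cost proxy.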

What If?

Though these missteps in climate policy and healthcare may seem relatively obvious in hindsight, predicting every unintended effect is impossible. But let's assume for a moment that it's possible. In this utopian world, ML engineers can detect every flaw in an algorithm, and policymakers can predict every possible self-defeating incentive.

Even in this scenario, problem-solving approaches that, like LLMs, rely solely on data collection, manipulation, and analysis would still fall short. Fundamentally, LLMs learn language patterns by ingesting and manipulating text data, but they don't have a built-in BS Detector like humans do. To compensate, humans provide critical feedback to LLMs through Reinforcement Learning From Human Feedback (RLHF). However, even this training method lacks context. The missing context can be classified into two types of covert intelligence.
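
For readers curious what that feedback looks like in practice, below is a highly simplified, hypothetical sketch of the preference-learning step behind RLHF. The toy features, example responses, and training loop are all assumptions made for illustration; real systems train a neural reward model on large volumes of human comparisons and then fine-tune the LLM against it with reinforcement learning.

```python
# Toy preference learning: a human marks which of two responses they prefer,
# and a tiny reward model learns to score responses accordingly.
import math

def features(response: str) -> list[float]:
    """Toy features: length (scaled) and whether it hedges with 'maybe'."""
    return [len(response) / 100, 1.0 if "maybe" in response else 0.0]

# Hypothetical preference data: (preferred response, rejected response).
preferences = [
    ("Paris is the capital of France.", "maybe it's Lyon?"),
    ("Water boils at 100 C at sea level.", "maybe around 90 C, not sure"),
    ("Mitochondria produce ATP.", "maybe they store DNA"),
]

weights = [0.0, 0.0]

# Bradley-Terry-style logistic loss: push the preferred response's score
# above the rejected one's, via plain gradient descent.
for _ in range(500):
    for preferred, rejected in preferences:
        fp, fr = features(preferred), features(rejected)
        margin = sum(w * (p - r) for w, p, r in zip(weights, fp, fr))
        grad_scale = 1 - 1 / (1 + math.exp(-margin))  # bigger when more wrong
        for i in range(2):
            weights[i] += 0.1 * grad_scale * (fp[i] - fr[i])

def reward(response: str) -> float:
    """Score a response with the learned reward model."""
    return sum(w * f for w, f in zip(weights, features(response)))

candidates = ["maybe the answer is 42?", "The answer is 42."]
print(max(candidates, key=reward))  # picks the confident, preferred style
```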

🥷TWO TYPES OF COVERT INTELLIGENCE

1. Cultural Intelligence

Cultural intelligence is knowledge encoded in societal norms and practices, often without a strictly rational explanation. We often underestimate how much of our intelligence is cultural, and our inability to encode this type of intelligence into data can hamper an AI model's ability to achieve our objectives.

Many cultures follow an abundance of practices without explicit scientific reasons. For example, cassava has been consumed by indigenous tribes in the Amazon Rainforest for millennia. This root vegetable is a staple food rich in carbohydrates. However, cassava can release cyanide when ingested. So, to consume it safely, tribes developed a detoxification process that involves soaking, grating, and heating the root. To this day, indigenous tribes rely on these traditional practices when preparing cassava, without understanding the underlying scientific principles.

Cultural practices form over centuries. In fact, they can become so ingrained in our psyche that they become invisible. This is a huge reason why human-AI alignment efforts have struggled so much. We implicitly assume that the metrics we benchmark AI models against are comprehensive. In reality, they only contain partial information; the rest is collective knowledge we use to fill in the details without realizing it. For example, self-driving cars have struggled to navigate lane merges. When we drive, sometimes we allow cars to merge into our lane when theirs ends. But other times, we cut each other off. Self-driving cars rely on training data focused on rules and protocols, not the contextual, fluid communication cues exchanged between human drivers. They're not trained to fill in these blanks, leading to underperformance.

2. Intuition-Based Intelligence

The world is 3D, but much of the information we receive about it is 2D. We possess a remarkable ability to build 3D frameworks from 2D information. This ability allows us to infer form, depth, and motion from flat images by "reading between the lines" or speculating about what the data points imply. For example, when shopping for new clothes online, we imagine what they'd look like on us. Meteorologists examine 2D weather maps but visualize dynamics like moisture, pressure, and temperature in their minds to infer how they'll affect precipitation.

AI models don't have the same luxury. While it's true that AI models are becoming increasingly proficient at 3D representations of complex phenomena, these virtual depictions lack the contextual understanding inherent to humans. For example, image generators like OpenAI's DALL-E 3 can produce images but can't judge quality by comparing the produced images to the real thing. Redditor u/Algoartist tested this by looping OpenAI's DALL-E 3 to describe and recreate the Mona Lisa. The result was 20 different recreations, all poorly constructed in unique ways. Even a quick glance at an AI-generated image tells you whether it matches the original. Our intuitions do a lot more heavy lifting than we give them credit for. To succeed in most complex endeavors, the ability to look beyond the data to grasp the hidden themes or big picture is a must.

🔑KEY TAKEAWAY

Fundamentally, AI models are data manipulation machines. Training data only includes what humans can write, sense, film, or photograph. Hidden intelligence, like intuition, cultural practices, and implicit assumptions, cannot be encoded in data. As long as data remains context-deficient, human judgment reigns supreme. Given the infinite variety of tasks and contexts, no universal rule can tell us whether to use AI. We'll just have to judge for ourselves.

📒FINAL NOTE

FEEDBACK

How would you rate today's email?

It helps us improve the content for you!

❤️TAIP Review of The Week

"This newsletter helps me stay informed on AI's evolution. Great stuff!"

-Marcus (1️⃣ 👍Nailed it!)

REFER & EARN

🎉Your Friends Learn, You Earn!

You currently have 0 referrals, only 1 away from receiving the ⚙️Ultimate Prompt Engineering Guide.

Refer 3 friends to learn how to 👷‍♀️Build Custom Versions of OpenAI's ChatGPT.
