Hidden Intelligence: What AI Can't Capture
PLUS: How Carbon Credits Show Us the Limitations of Data-Based Decision Making
Welcome back, AI prodigies!
In today's Sunday Special:
Data Is Dumb
More Metrics, More Malpractice
Two Types of Covert Intelligence
Key Takeaway
Read Time: 6 minutes
Key Terms
Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.
Machine Learning (ML): Leverages data to recognize patterns and make predictions without explicit instructions from developers.
Encoding Human Intuition: Translating a human's subjective, often subconscious, decision-making process into a format that machines can understand.
Reinforcement Learning From Human Feedback (RLHF): A training method that uses human preference judgments to fine-tune LLMs so their outputs better align with what people actually want.
PULSE CHECK
When will we be able to encode human intuition in data? Vote Below to View Live Results
DATA IS DUMB
"In four years, a child has seen 50 times more data than the biggest LLMs... Text is simply too low bandwidth and too scarce a modality to learn how the world works."
The world is a visually complex, multidimensional place. Although the largest openly available LLM, Meta's Llama 3.1, was trained on roughly 11 trillion words, text data will never capture this complexity. Since data is the fuel of every LLM, every LLM-based AI application, from Writing Copilots to Answering Engines, will suffer the same limitations.
High-quality datasets can mitigate this issue. For example, many Big Tech companies seek to license Wikipedia to train their AI models because it's generally accurate, even though it doesn't adhere to rigorous academic standards. However, even high-quality datasets, such as collections of the most widely cited scientific publications, fail to capture covert forms of intelligence. How do these limitations arise? And what types of intelligence are missing?
MORE METRICS, MORE MALPRACTICE
Any metric, no matter how carefully constructed, has its limitations. Deciding which kind of data to collect inherently means excluding other kinds. As British statistician George E. P. Box famously noted, "All models are wrong, but some are useful." Regardless of size, any AI model trained on a dataset will always fail to capture some element of the complex relationships it tries to represent.
When an LLM generates an output, it makes a series of decisions. LLMs are trained to "predict" the next word by analyzing strings of text and producing a response in which each subsequent word is the one most likely to follow the words before it. In simple terms, LLMs are text-generating machines that learn the sequences present in their training data.
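To make the "predict the next word" idea concrete, here's a deliberately tiny, illustrative sketch in Python: a bigram counter that picks the most frequent follower of a word. Real LLMs work over enormous vocabularies with learned probability distributions rather than raw counts, so treat this only as an intuition pump, not how any production model is built.

```python
from collections import Counter, defaultdict

# Toy illustration (not a real LLM): learn next-word statistics from a tiny
# corpus, then "predict" the next word as the most frequent follower.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word (a bigram table).
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent next word seen after `word` in the corpus."""
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(predict_next("the"))  # "cat" -- the most frequent follower of "the"
print(predict_next("cat"))  # "sat" -- ties broken by first appearance
```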
Making decisions based on data, whether by humans or LLMs, involves selecting from a range of possibilities based on available information. However, the critical difference between humans and LLMs lies in the nature of the data and the decision-making process. LLMs rely on statistical patterns, while humans weigh a much broader context. Oftentimes, a variable that seemed unrelated ends up predicting the outcome.
Humans?
In many industries, new metrics created new incentives, resulting in unexpected behaviors. For example, the United Nations (UN) proposed Carbon Credits to fight climate change. Carbon Credits work like permission slips for emissions: if a company exceeds its government-mandated emission limits, it has to buy Carbon Credits. The idea was to incentivize companies to install renewable energy infrastructure to offset the greenhouse gases they emit. Instead, many companies simply bought plots of forest that were never at risk of being cut down and claimed a reduction in carbon emissions. A recent investigation into the world's leading Carbon Credits marketplace found that 90% of its credits were "phantom credits" that didn't represent real carbon reductions.
LLMs?
In healthcare, this can be particularly problematic. Nearly five years ago, a bombshell study, now cited over 1,266 times, found racial bias in a clinical algorithm used to recommend care for millions of patients. The mathematical formula used healthcare costs as a proxy for illness severity, where higher costs correlate with more severe illness. As is common in Machine Learning (ML), the algorithm was trained on past healthcare spending data, but that spending reflected steep disparities in access to care between black patients and white patients. As a result, black patients had to be deemed much sicker than white patients with the same symptoms to be recommended for the same care.
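To see how a proxy label can quietly encode a disparity, here's a toy, fully synthetic sketch (not the actual clinical algorithm or its data). It assumes, purely for illustration, that one group generates about 40% less spending at the same level of sickness, and shows that a cost-based cutoff then demands greater severity from that group.

```python
import numpy as np

# Synthetic illustration: using healthcare *cost* as a proxy for illness
# severity under-scores a group that spends less for the same sickness.
rng = np.random.default_rng(0)
n = 10_000

severity = rng.normal(size=n)            # true (unobserved) illness severity
group = rng.integers(0, 2, size=n)       # 0 = full access, 1 = reduced access

# Assumed for illustration: the reduced-access group generates ~40% less
# spending at the same severity, so cost is a biased proxy for need.
cost = 1_000 * np.exp(severity) * np.where(group == 1, 0.6, 1.0)

# A cost-based "risk score" refers the top 3% of spenders to extra care.
threshold = np.quantile(cost, 0.97)
referred = cost >= threshold

# Average true severity required to be referred, by group:
for g in (0, 1):
    mean_sev = severity[(group == g) & referred].mean()
    print(f"group {g}: mean severity of referred patients = {mean_sev:.2f}")
# The reduced-access group must be sicker, on average, to cross the same
# cost threshold -- the proxy, not the patients, creates the gap.
```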
What If?
Though these missteps in renewable energy efforts and healthcare initiatives may seem relatively obvious in hindsight, predicting every unintended effect is impossible. But let's assume for a moment that it's possible. In this utopian world, ML engineers can detect every flaw in an algorithm, and policymakers can predict every possible self-defeating incentive.
Even in this scenario, problem-solving approaches that, like LLMs, rely solely on data collection, manipulation, and analysis would still fall short. Fundamentally, LLMs learn language patterns by ingesting and manipulating text data, but they don't have a built-in BS Detector like humans do. To compensate, humans provide critical feedback to LLMs through Reinforcement Learning From Human Feedback (RLHF). However, even this training method lacks context, and that missing context can be classified into two types of covert intelligence.
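For a feel of what that human feedback looks like in practice, here's a minimal sketch of the pairwise preference loss commonly used to train RLHF reward models: a human picks the better of two responses, and the reward model is nudged to score the chosen one higher than the rejected one. The scalar rewards below are made-up stand-ins for a reward model's outputs.

```python
import math

# Pairwise preference loss for an RLHF reward model:
# loss = -log(sigmoid(r_chosen - r_rejected)), small when the model already
# scores the human-preferred response higher.
def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Made-up scalar rewards, for illustration only:
print(preference_loss(2.0, -1.0))  # ~0.05 -- model agrees with the human
print(preference_loss(-1.0, 2.0))  # ~3.05 -- strong penalty; model disagrees
```

Notice that all the "context" this loss sees is a single human comparison per pair of responses, which is exactly why the feedback, while critical, is still incomplete.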
TWO TYPES OF COVERT INTELLIGENCE
1. Cultural Intelligence
Cultural intelligence is encoded into societal norms and practices without a strictly rational explanation. We often underestimate how much of our intelligence is cultural, and our inability to encode this type of intelligence into data can hamper an AI model's ability to achieve our objectives.
Many cultures follow an abundance of practices without explicit scientific reasons. For example, cassava has been consumed by indigenous tribes in the Amazon Rainforest for millennia. This root vegetable is a staple food rich in carbohydrates. However, raw cassava contains compounds that release cyanide when ingested. So, to consume it safely, tribes developed a detoxification process that involves soaking, grating, and heating. To this day, the indigenous tribes follow these practices when preparing cassava without understanding the underlying scientific principles.
Cultural practices form over centuries. In fact, they can become so ingrained in our psyche that they become invisible. This is a huge reason why efforts to align AI with human intentions have struggled so much. We implicitly assume that the metrics we benchmark AI models against are comprehensive. In reality, they contain only partial information; the rest is collective knowledge we use to fill in the details without realizing it. For example, self-driving cars have struggled to navigate lane merges. When we drive, we sometimes let cars merge into our lane when theirs ends, and other times we cut each other off. Self-driving cars rely on training data focused on rules and protocols, not on the contextual, fluid communication cues exchanged between human drivers. They're not trained to fill in these blanks, which leads to underperformance.
2. Intuition-Based Intelligence
The world is 3D, but much of the information we receive about it is 2D. We possess a remarkable ability to build 3D frameworks from that 2D information, inferring form, depth, and motion from flat images by "reading between the lines" or speculating about the implications of individual data points. For example, when shopping for new clothes online, we imagine what they'd look like on us. Meteorologists examine 2D weather maps but visualize dynamics like moisture, pressure, and temperature in their minds to infer how they'll affect precipitation.
AI models don't have the same luxury. While it's true that AI models are becoming increasingly proficient at 3D representations of complex phenomena, these virtual depictions lack the contextual understanding inherent to humans. Image generators like OpenAI's DALL-E 3 can produce images but can't judge their quality by comparing them to the real thing. Redditor u/Algoartist tested this by looping OpenAI's DALL-E 3 to describe and recreate the Mona Lisa. The result was 20 different recreations, all poorly constructed in unique ways. A human only needs a glance at an AI-generated image to tell whether it matches the original. Our intuitions do far more heavy lifting than we tend to give them credit for. To succeed in most complex endeavors, the ability to look beyond the data and grasp the hidden themes or the big picture is a must.
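If you want to try a similar experiment yourself, a describe-and-recreate loop might look roughly like the sketch below. It assumes the official OpenAI Python SDK with an API key in your environment; the model names, starting image URL, and loop length are illustrative, and this is not the Redditor's actual script.

```python
from openai import OpenAI

# Rough sketch of a describe-and-recreate loop. Each pass describes the
# current image with a vision-capable model, then regenerates it from that
# description alone, so context is lost a little more on every iteration.
client = OpenAI()

image_url = "https://example.com/mona_lisa.jpg"  # placeholder starting image

for step in range(5):  # the experiment described above ran ~20 iterations
    # 1) Ask a vision-capable model to describe the current image in detail.
    description = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this painting in detail."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    ).choices[0].message.content

    # 2) Feed that description back into the image generator.
    image_url = client.images.generate(
        model="dall-e-3",
        prompt=description[:4000],  # keep within the prompt length limit
        size="1024x1024",
        n=1,
    ).data[0].url

    print(f"Iteration {step + 1}: {image_url}")
```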
KEY TAKEAWAY
Fundamentally, AI models are data manipulation machines. Training data only includes what humans can write, sense, film, or photograph. Hidden intelligence, like intuition, cultural practices, and implicit assumptions, cannot be encoded in data. As long as data remains context-deficient, human judgment reigns supreme. Given the infinite variety of tasks and contexts, no universal rule can tell us whether to use AI. We'll just have to judge for ourselves.
FINAL NOTE
FEEDBACK
How would you rate today's email? It helps us improve the content for you!
TAIP Review of The Week
"This newsletter helps me stay informed on AI's evolution. Great stuff!"
REFER & EARN
Your Friends Learn, You Earn!
You currently have 0 referrals, only 1 away from receiving the Ultimate Prompt Engineering Guide.
Refer 3 friends to learn how to Build Custom Versions of OpenAI's ChatGPT.
Copy and paste this link to friends: https://theaipulse.beehiiv.com/subscribe?ref=PLACEHOLDER