Hidden Intelligence: What AI Can't Capture
PLUS: How Carbon Credits Show Us the Limitations of Data-Based Decision Making
Welcome back, AI prodigies!
In today's Sunday Special:
Data Is Dumb
More Metrics, More Malpractice
Two Types of Covert Intelligence
Key Takeaway
Read Time: 6 minutes
Key Terms
Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate human-like text.
Machine Learning (ML): Leverages data to recognize patterns and make predictions without explicit instructions from developers.
Encoding Human Intuition: Translating a human's subjective, often subconscious, decision-making process into a format that machines can understand.
Reinforcement Learning From Human Feedback (RLHF): A training method that uses human preference judgments to fine-tune LLMs so their outputs better align with what people actually want.
PULSE CHECK
When will we be able to encode human intuition in data? Vote Below to View Live Results
DATA IS DUMB
"In four years, a child has seen 50 times more data than the biggest LLMs... Text is simply too low bandwidth and too scarce a modality to learn how the world works."
The world is a visually complex, multidimensional place. Although the largest openly available LLM, Meta's Llama 3.1, was trained on roughly 11 trillion words, text data will never capture this complexity. Since data is the fuel of every LLM, every LLM-based AI application, from Writing Copilots to Answering Engines, will suffer the same limitations.
High-quality datasets can mitigate this issue. For example, many Big Tech companies seek to license Wikipedia to train their AI models because it's generally accurate, even though it doesn't adhere to rigorous academic standards. However, even high-quality datasets, such as collections of the most widely cited scientific publications, fail to capture covert forms of intelligence. How do these limitations arise? And what types of intelligence are missing?
MORE METRICS, MORE MALPRACTICE
Any metric, no matter how carefully constructed, has its limitations. Deciding which kind of data to collect inherently means excluding other kinds. As British statistician George E. P. Box famously noted, "All models are wrong, but some are useful." Regardless of size, any AI model trained on a dataset will always fail to capture some element of the complex relationships it tries to represent.
When an LLM generates an output, it makes a series of decisions. LLMs are trained to "predict" the next word by analyzing strings of text and producing a response in which each subsequent word is the one most likely to follow the words before it. In simple terms, LLMs are text-generating machines that learn the sequences present in their training data.
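To make the "predict the next word" idea concrete, here's a deliberately tiny, illustrative sketch in Python: a bigram counter that picks the most frequent follower of a word. Real LLMs work over enormous vocabularies with learned probability distributions rather than raw counts, so treat this only as an intuition pump, not how any production model is built.

```python
from collections import Counter, defaultdict

# Toy illustration (not a real LLM): learn next-word statistics from a tiny
# corpus, then "predict" the next word as the most frequent follower.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word (a bigram table).
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent next word seen after `word` in the corpus."""
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(predict_next("the"))  # "cat" -- the most frequent follower of "the"
print(predict_next("cat"))  # "sat" -- ties broken by first appearance
```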
Making decisions based on data, whether by humans or LLMs, involves selecting from a range of possibilities based on available information. However, the critical difference between humans and LLMs lies in the nature of the data and the decision-making process. LLMs rely on statistical patterns, while humans weigh a much broader context. Oftentimes, a variable that seemed unrelated ends up predicting the outcome.
Humans?
In many industries, new metrics created new incentives, resulting in unexpected behaviors. For example, the United Nations (UN) proposed Carbon Credits to fight climate change. Carbon Credits work like permission slips for emissions: if a company exceeds its government-mandated emission limits, it has to buy Carbon Credits. The idea was to incentivize companies to install renewable energy infrastructure to offset the greenhouse gases they emit. Instead, many companies simply bought plots of forest that were never at risk of being cut down and claimed a reduction in carbon emissions. A recent investigation into the world's leading Carbon Credits marketplace found that 90% of its credits were "phantom credits" that didn't represent real carbon reductions.
LLMs?
In healthcare, this can be particularly problematic. Nearly five years ago, a bombshell study, now cited over 1,266 times, found racial bias in a clinical algorithm used to recommend care for millions of patients. The mathematical formula used healthcare costs as a proxy for illness severity, where higher costs correlate with more severe illness. As is common in Machine Learning (ML), the algorithm was trained on past healthcare spending data, but that spending reflected steep disparities in access to care between black patients and white patients. As a result, black patients had to be deemed much sicker than white patients with the same symptoms to be recommended for the same care.
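To see how a proxy label can quietly encode a disparity, here's a toy, fully synthetic sketch (not the actual clinical algorithm or its data). It assumes, purely for illustration, that one group generates about 40% less spending at the same level of sickness, and shows that a cost-based cutoff then demands greater severity from that group.

```python
import numpy as np

# Synthetic illustration: using healthcare *cost* as a proxy for illness
# severity under-scores a group that spends less for the same sickness.
rng = np.random.default_rng(0)
n = 10_000

severity = rng.normal(size=n)            # true (unobserved) illness severity
group = rng.integers(0, 2, size=n)       # 0 = full access, 1 = reduced access

# Assumed for illustration: the reduced-access group generates ~40% less
# spending at the same severity, so cost is a biased proxy for need.
cost = 1_000 * np.exp(severity) * np.where(group == 1, 0.6, 1.0)

# A cost-based "risk score" refers the top 3% of spenders to extra care.
threshold = np.quantile(cost, 0.97)
referred = cost >= threshold

# Average true severity required to be referred, by group:
for g in (0, 1):
    mean_sev = severity[(group == g) & referred].mean()
    print(f"group {g}: mean severity of referred patients = {mean_sev:.2f}")
# The reduced-access group must be sicker, on average, to cross the same
# cost threshold -- the proxy, not the patients, creates the gap.
```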
What If?
Though these missteps in renewable energy efforts and healthcare initiatives may seem relatively obvious in hindsight, predicting every unintended effect is impossible. But let's assume for a moment that it's possible. In this utopian world, ML engineers can detect every flaw in an algorithm, and policymakers can predict every possible self-defeating incentive.
Even in this scenario, problem-solving approaches that, like LLMs, rely solely on data collection, manipulation, and analysis would still fall short. Fundamentally, LLMs learn language patterns by ingesting and manipulating text data, but they don't have a built-in BS Detector like humans do. To compensate, humans provide critical feedback to LLMs through Reinforcement Learning From Human Feedback (RLHF). However, even this training method lacks context, and that missing context can be classified into two types of covert intelligence.
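For a feel of what that human feedback looks like in practice, here's a minimal sketch of the pairwise preference loss commonly used to train RLHF reward models: a human picks the better of two responses, and the reward model is nudged to score the chosen one higher than the rejected one. The scalar rewards below are made-up stand-ins for a reward model's outputs.

```python
import math

# Pairwise preference loss for an RLHF reward model:
# loss = -log(sigmoid(r_chosen - r_rejected)), small when the model already
# scores the human-preferred response higher.
def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Made-up scalar rewards, for illustration only:
print(preference_loss(2.0, -1.0))  # ~0.05 -- model agrees with the human
print(preference_loss(-1.0, 2.0))  # ~3.05 -- strong penalty; model disagrees
```

Notice that all the "context" this loss sees is a single human comparison per pair of responses, which is exactly why the feedback, while critical, is still incomplete.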
TWO TYPES OF COVERT INTELLIGENCE
1. Cultural Intelligence
Cultural intelligence is encoded into societal norms and practices without a strictly rational explanation. We often underestimate how much of our intelligence is cultural, and our inability to encode this type of intelligence into data can hamper an AI model's ability to achieve our objectives.
Many cultures follow an abundance of practices without explicit scientific reasons. For example, cassava has been consumed by indigenous tribes in the Amazon Rainforest for millennia. This root vegetable is a staple food rich in carbohydrates. However, raw cassava contains compounds that release cyanide when ingested. So, to consume it safely, tribes developed a detoxification process that involves soaking, grating, and heating. To this day, the indigenous tribes follow these practices when preparing cassava without understanding the underlying scientific principles.
Cultural practices form over centuries. In fact, they can become so ingrained in our psyche that they become invisible. This is a huge reason why efforts to align AI with human intentions have struggled so much. We implicitly assume that the metrics we benchmark AI models against are comprehensive. In reality, they contain only partial information; the rest is collective knowledge we use to fill in the details without realizing it. For example, self-driving cars have struggled to navigate lane merges. When we drive, we sometimes let cars merge into our lane when theirs ends, and other times we cut each other off. Self-driving cars rely on training data focused on rules and protocols, not on the contextual, fluid communication cues exchanged between human drivers. They're not trained to fill in these blanks, which leads to underperformance.
2. Intuition-Based Intelligence
The world is 3D, but much of the information we receive about it is 2D. We possess a remarkable ability to build 3D frameworks from that 2D information, inferring form, depth, and motion from flat images by "reading between the lines" or speculating about the implications of individual data points. For example, when shopping for new clothes online, we imagine what they'd look like on us. Meteorologists examine 2D weather maps but visualize dynamics like moisture, pressure, and temperature in their minds to infer how they'll affect precipitation.
AI models don't have the same luxury. While it's true that AI models are becoming increasingly proficient at 3D representations of complex phenomena, these virtual depictions lack the contextual understanding inherent to humans. Image generators like OpenAI's DALL-E 3 can produce images but can't judge their quality by comparing them to the real thing. Redditor u/Algoartist tested this by looping OpenAI's DALL-E 3 to describe and recreate the Mona Lisa. The result was 20 different recreations, all poorly constructed in unique ways. A human only needs a glance at an AI-generated image to tell whether it matches the original. Our intuitions do far more heavy lifting than we tend to give them credit for. To succeed in most complex endeavors, the ability to look beyond the data and grasp the hidden themes or the big picture is a must.
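If you want to try a similar experiment yourself, a describe-and-recreate loop might look roughly like the sketch below. It assumes the official OpenAI Python SDK with an API key in your environment; the model names, starting image URL, and loop length are illustrative, and this is not the Redditor's actual script.

```python
from openai import OpenAI

# Rough sketch of a describe-and-recreate loop. Each pass describes the
# current image with a vision-capable model, then regenerates it from that
# description alone, so context is lost a little more on every iteration.
client = OpenAI()

image_url = "https://example.com/mona_lisa.jpg"  # placeholder starting image

for step in range(5):  # the experiment described above ran ~20 iterations
    # 1) Ask a vision-capable model to describe the current image in detail.
    description = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this painting in detail."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    ).choices[0].message.content

    # 2) Feed that description back into the image generator.
    image_url = client.images.generate(
        model="dall-e-3",
        prompt=description[:4000],  # keep within the prompt length limit
        size="1024x1024",
        n=1,
    ).data[0].url

    print(f"Iteration {step + 1}: {image_url}")
```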
KEY TAKEAWAY
Fundamentally, AI models are data manipulation machines. Training data only includes what humans can write, sense, film, or photograph. Hidden intelligence, like intuition, cultural practices, and implicit assumptions, cannot be encoded in data. As long as data remains context-deficient, human judgment reigns supreme. Given the infinite variety of tasks and contexts, no universal rule can tell us whether to use AI. We'll just have to judge for ourselves.
FINAL NOTE
FEEDBACK
How would you rate today's email? It helps us improve the content for you!
TAIP Review of The Week
"This newsletter helps me stay informed on AI's evolution. Great stuff!"
REFER & EARN
Your Friends Learn, You Earn!
You currently have 0 referrals, only 1 away from receiving the Ultimate Prompt Engineering Guide.
Refer 3 friends to learn how to Build Custom Versions of OpenAI's ChatGPT.
Copy and paste this link to friends: https://theaipulse.beehiiv.com/subscribe?ref=PLACEHOLDER