🧠 How Will AI Navigate Uncertainty?
PLUS: Why Hurricane Forecasting Errors Dropped by 60%

Welcome back, AI prodigies!
In today’s Sunday Special:
📜The Prelude
👷Deconstructing Intelligence
🤔Dealing With Uncertainty
🍰Layered Control
🔑Key Takeaway
Read Time: 7 minutes
🎓Key Terms
AI Agents: Virtual employees who can autonomously plan, execute, and refine their actions.
Humanoid Robot: A robot designed to resemble the human body in form and function.
Deep Learning (DL): Mimics the human brain by creating multiple layers of “artificial” neurons to solve complex problems.
Generative AI (GenAI): When AI models create new content such as text, images, audio, video, or code.
Graphics Processing Units (GPUs): Specialized computer chips capable of performing many mathematical calculations simultaneously.
Convolutional Neural Networks (CNNs): A network of spatial layers that detects complex visual patterns, such as edges, textures, and structures.
🩺 PULSE CHECK
Will we ever be able to predict the future? Vote Below to View Live Results
📜THE PRELUDE
No matter how powerful AI becomes, uncertainty is here to stay.
Unlike humans, who navigate uncertainty by developing intuition and judgment, AI still lacks the ability to understand the world beyond patterns buried in datasets.
It lacks the human instincts that turn uncertainty into insight: AI can statistically optimize for numerical outcomes, but it doesn’t truly grasp risk, nuance, or ambiguity.
So, what can human forecasters teach us about operating under uncertainty? How can we apply similar principles to build safer AI Systems?
👷DECONSTRUCTING INTELLIGENCE
🤖AI’s Intelligence, Outlined.
Comparing the intelligence of an astronaut, a blacksmith, and a chess champion makes little sense. Each applies a distinct set of skills to solve very different kinds of problems. Defining human intelligence in general is even harder; Psychologists and Cognitive Scientists still struggle to agree on what it actually means.
Things are no different in the realm of AI. AI’s intelligence is far too nuanced to serve as a reliable basis for predicting the technology’s future impact. Instead, we separate AI’s intelligence into two concrete dimensions:
Capability: How well an AI System performs a specific complex task.
Impact: How well an AI System’s Capability translates into real-world results.
🦾Capability, Impact, and Humans.
AI Systems exhibit remarkable Capability, but the technology’s Impact suffers when they act in the real world without human oversight. For example, Johnson & Johnson (J&J) let employees leverage GenAI to streamline core tasks, like discovering chemical compounds, cutting production times for new drugs by 50%. In contrast, when Researchers at Carnegie Mellon University (CMU) staffed a fake software company entirely with AI Agents, the results were a disaster. The best-performing AI Agent completed just 24% of its assigned workflows. Even that poor performance was shockingly expensive: at $6.10 per step and an average of 30 steps per workflow, each workflow cost roughly $183. Each AI Agent exhibited a lack of common sense, weak social skills, and a poor understanding of how to navigate websites.
📊Closing the Gap?
Closing the gap between Capability and Impact requires intuition and judgment. The real world is full of ambiguity, exceptions to rules, and shifting incentives. Before exploring how AI Systems can bridge this gap, let’s examine how humans deal with uncertainty.
🤔DEALING WITH UNCERTAINTY
💭Forecasting World Events.
In 2011, the University of Pennsylvania (UPenn) launched the Good Judgment Project (GJP) to test the limits of human predictive capability. It was led by American Psychologist Philip E. Tetlock, who recruited thousands of volunteers to predict real-world geopolitical events and economic affairs. These volunteers were asked to assign probabilities to questions like “Will North Korea launch a ballistic missile before December 1, 2025?” or “Will the Eurozone enter a recession next year?” Over time, the GJP identified a small subset of volunteers, later dubbed superforecasters, who consistently outperformed Intelligence Analysts and Betting Markets.
These superforecasters earned Brier Scores between 0.20 and 0.22. A Brier Score measures how far the predicted probability of an event falls from the event’s actual outcome: 0 means you were 100% confident and right, and 1 means you were 100% confident and wrong. Even the best superforecasters couldn’t break through the barrier of uncertainty. However, their track record showed that with the right mindset and method, it’s possible to predict the future with greater confidence.
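In code, a Brier Score is just the average squared gap between forecasts and reality. Here’s a minimal Python sketch, with made-up predictions for illustration:

```python
# Minimal sketch of the Brier Score: the mean squared difference between
# predicted probabilities and actual outcomes (1 = happened, 0 = didn't).
# The forecasts below are made-up examples, not GJP data.

def brier_score(forecasts, outcomes):
    """Average of (probability - outcome)^2 across all predictions."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical forecaster: 80% on an event that happened,
# 30% on one that didn't, 60% on one that did.
print(brier_score([0.8, 0.3, 0.6], [1, 0, 1]))  # ~0.097 (lower is better)
```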
💨Forecasting Hurricane Movements.
Ensemble Numerical Weather Prediction (NWP) Systems use DL frameworks to integrate vast amounts of real-time data from aircraft, radars, satellites, ocean buoys, and historical weather records. The NWP Systems run hundreds of simulations of a hurricane’s potential path and intensity, each starting from slightly different initial atmospheric conditions. This process, called Ensemble Forecasting, helps capture the inherent uncertainty in the atmosphere’s chaotic behavior.
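Here’s a toy Python sketch of the idea, using the chaotic Lorenz system as a stand-in for the atmosphere (an illustrative assumption, not a real NWP System): run many simulations from slightly perturbed initial conditions and look at the spread of outcomes.

```python
import random

# Toy Ensemble Forecasting sketch on the chaotic Lorenz system.
# Tiny perturbations to initial conditions produce a wide spread
# of outcomes; that spread is the forecast's uncertainty.

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system one small Euler step."""
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def run_member(x, y, z, steps=2000):
    """Simulate one ensemble member and return its final x state."""
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
    return x

# 100-member ensemble: same model, slightly different starting points.
ensemble = [run_member(1.0 + random.uniform(-1e-4, 1e-4), 1.0, 1.0)
            for _ in range(100)]

# The range of final states captures the inherent uncertainty.
print(f"min={min(ensemble):.2f}, max={max(ensemble):.2f}")
```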
In weather forecasting, seemingly insignificant differences in the initial atmospheric conditions can lead to vastly different outcomes over time, referred to as the Butterfly Effect. No amount of data crunching can entirely overcome this fundamental limitation. Nevertheless, hurricane forecasts have improved dramatically in recent years.
In 2011, the average 3-day forecast of a hurricane eye’s location was off by 161 miles. By 2023, that error had dropped to around 69 miles. So, what changed? In the early 2000s, the Capability of AI Systems was weak. We lacked the computational resources, like GPUs, and the high-quality data sources, like labeled satellite imagery datasets, needed to train DL frameworks to predict hurricane movements.
Today, one commonly used DL framework in weather forecasting is the CNN, which scans thousands of satellite images to spot subtle visual patterns, like tiny shifts in cloud formations. Feeding this nuanced, high-resolution satellite imagery to CNNs creates a far more accurate snapshot of a hurricane’s current state. Despite these improvements, 3-day forecasts of a hurricane eye’s location are still off by about 69 miles on average. Meteorologists capture this with the Cone of Uncertainty, a cone-shaped area on the weather forecasting map that includes all the places the hurricane’s eye could conceivably go.
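Here’s a minimal, hypothetical PyTorch sketch of the idea: convolutional layers scan an image for spatial patterns, and a final layer maps those patterns to a prediction. The architecture, shapes, and eye-position output are assumptions for illustration, not a real forecasting model.

```python
import torch
import torch.nn as nn

# Hypothetical CNN sketch: convolutional layers scan satellite
# imagery for spatial patterns; a linear head turns them into
# a predicted eye-position offset. Shapes are made up.

class StormCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # edges, textures
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # larger structures
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                      # global summary per channel
        )
        self.head = nn.Linear(32, 2)  # predict (latitude, longitude) offset

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# One fake 64x64 RGB satellite tile -> predicted eye-position offset.
model = StormCNN()
print(model(torch.randn(1, 3, 64, 64)))  # tensor of shape (1, 2)
```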
🍦The Cone of Uncertainty?
In all predictive pursuits, the Cone of Uncertainty highlights where potential outcomes lie. The outputs of an AI System fall within their own Cone of Uncertainty. So, how can we control this Cone of Uncertainty without diminishing the quality of an AI System’s outputs?
🍰LAYERED CONTROL
Instead of reinventing the wheel, we can leverage three proven techniques to manage the risks of AI and build safer AI Systems:
Circuit Breakers: We can embed Circuit Breakers (i.e., automatic mechanisms that pause or shut down an automated command when error rates exceed predefined thresholds) directly into AI Systems. For example, a High‑Frequency Trading (HFT) platform might execute thousands of trades per second while a parallel automated framework continually tracks metrics like Cumulative Drawdown (i.e., the total decline in value of an investment) and Execution Latency (i.e., the time between submitting a trade and the trade being executed). If a trade portfolio’s Cumulative Drawdown exceeds 15% or Execution Latency spikes above 50 milliseconds (ms), the Circuit Breaker instantly halts all trading activity (see the first sketch after this list).
Least-Privilege Access: We can apply Least‑Privilege Access (i.e., restricting permissions to the minimum access necessary for a specific task) to ensure that AI Agents only access the exact datasets and control mechanisms they need. An administrative AI Agent might generate official progress summaries by accessing specific task dashboards. Under Least-Privilege Access, that AI Agent would be denied access to internal chat logs, even if they mention specific task deadlines, because those logs aren’t necessary for generating official progress summaries and would exceed the AI Agent’s scope of responsibility (see the second sketch after this list).
Hierarchical Oversight: We can implement Hierarchical Oversight (i.e., a multi‑tiered structure where simpler, highly reliable AI Systems monitor and intervene on more complex, less predictable AI Systems) to make Humanoid Robots safer. Consider an industrial Humanoid Robot moving packages across a warehouse. Its complex AI System optimizes route efficiency, but if it attempts a high-speed maneuver near a human worker, a lightweight safety AI System instantly overrides the command and stops the Humanoid Robot, prioritizing human worker safety (see the third sketch after this list).
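First, a minimal Python sketch of a Circuit Breaker using the thresholds above; the class and metric names are hypothetical, not from a real HFT platform:

```python
# Hypothetical Circuit Breaker sketch using the article's thresholds:
# halt all trading if Cumulative Drawdown exceeds 15% or Execution
# Latency spikes above 50 ms. Names and values are illustrative.

MAX_DRAWDOWN = 0.15    # 15% cumulative drawdown
MAX_LATENCY_MS = 50.0  # 50 millisecond execution latency

class CircuitBreaker:
    def __init__(self):
        self.halted = False

    def check(self, drawdown: float, latency_ms: float) -> bool:
        """Return True if trading may continue, False if halted."""
        if drawdown > MAX_DRAWDOWN or latency_ms > MAX_LATENCY_MS:
            self.halted = True  # latches: stays halted once tripped
        return not self.halted

breaker = CircuitBreaker()
print(breaker.check(drawdown=0.08, latency_ms=12.0))  # True: keep trading
print(breaker.check(drawdown=0.16, latency_ms=12.0))  # False: halt everything
```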
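Second, a minimal Python sketch of Least-Privilege Access as a deny-by-default allowlist; the agent and resource names are made up for illustration:

```python
# Hypothetical Least-Privilege Access sketch: each AI Agent gets an
# explicit allowlist of resources, and everything else is denied by
# default. Agent and resource names are made up.

PERMISSIONS = {
    # The administrative AI Agent only needs task dashboards.
    "admin_agent": {"task_dashboards"},
}

def access(agent: str, resource: str) -> bool:
    """Deny by default; allow only resources on the agent's allowlist."""
    return resource in PERMISSIONS.get(agent, set())

print(access("admin_agent", "task_dashboards"))    # True: within scope
print(access("admin_agent", "internal_chat_logs")) # False: out of scope
```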
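Third, a minimal Python sketch of Hierarchical Oversight, where a simple safety layer vetoes a complex planner’s command; the speed limit and command fields are assumptions:

```python
# Hypothetical Hierarchical Oversight sketch: a simple, highly reliable
# safety layer vetoes commands from a complex planner. The threshold
# and command fields are illustrative assumptions.

SAFE_SPEED_NEAR_HUMANS = 0.5  # meters/second, illustrative limit

def safety_override(command: dict) -> dict:
    """Simple layer: stop the robot if it moves fast near a human."""
    if command["human_nearby"] and command["speed"] > SAFE_SPEED_NEAR_HUMANS:
        return {**command, "speed": 0.0, "overridden": True}
    return {**command, "overridden": False}

# The complex planner proposes a high-speed maneuver near a worker...
planned = {"speed": 2.0, "human_nearby": True}
print(safety_override(planned))  # ...and the safety layer halts it.
```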
By integrating Circuit Breakers, Least-Privilege Access, and Hierarchical Oversight, we can bridge the gap between Capability and Impact.
🔑KEY TAKEAWAY
The raw Capability of AI is impressive, but uncertainty in the real world can quickly undermine it. The Impact of AI depends on built-in safeguards to catch mistakes before they cause harm. In an increasingly uncertain world, it’s critical to mitigate the risks of AI to ensure the technology aligns with human objectives.
📒FINAL NOTE
FEEDBACK
How would you rate today’s email? It helps us improve the content for you!
❤️TAIP Review of The Week
“Brilliant. Loved it!😊”
REFER & EARN
🎉Your Friends Learn, You Earn!
You currently have 0 referrals, only 1 away from receiving 🎓3 Simple Steps to Turn ChatGPT Into an Instant Expert.
Share your unique referral link: https://theaipulse.beehiiv.com/subscribe?ref=PLACEHOLDER