
🧠 Plato’s Lesson on AI Alignment

PLUS: How Many Realities Do You Think There Are?

Welcome back AI prodigies!

In today’s Sunday Special:

  • 🎯How Does AI Learn?

  • 🪨The Allegory of the Cave

  • ⚙️Convergence

  • 🔑Key Takeaway

Read Time: 7 minutes

🎓Key Terms

  • Vector Embeddings: numerical representations of data that capture semantic meaning.

  • Foundation Model: AI models trained on massive amounts of general data (e.g., text, code, images, video, and audio). They're designed as versatile base models that can be fine-tuned to build various AI applications.

  • Artificial General Intelligence (AGI): AI models that perform tasks as well as humans and exhibit human traits such as critical reasoning, intuition, consciousness, sentience, and emotional awareness.

  • Hypothesis Space: the set of all possible solutions an AI algorithm can generate.

🩺 PULSE CHECK

How many realities do you think there are?


🎯HOW DOES AI LEARN?

Are all AI models becoming the same? More specifically, are general-purpose AI models showing signs of extreme similarity? And will AI models, regardless of modality, produce increasingly identical outputs?

Researchers have found evidence of this phenomenon: AI models seem to converge toward a “platonic” representation, a single shared way of understanding the world. To see why researchers pose this question, we first need to understand how AI models interpret the world. Representation is one of the most essential words in AI, and for good reason. For every concept it learns, an AI model builds a representation: a compressed way of describing that concept that captures its key attributes.

While the vast majority of information online might seem like a jumble of words, AI models transform these concepts into numerical representations called vector embeddings. But why vector embeddings?

This Shift to Vector Embeddings Unlocks Two Key Advantages:
  1. Concepts in Numerical Form: Machines crunch numbers. Therefore, all data needs to be numerical.

  2. Similarity: By representing information in vector form, we can measure the “distance,” or level of similarity, between different words.

Imagine the vast amount of information on the internet as a sea of concepts. To navigate this sea efficiently, AI models translate concepts into vector embeddings, like points on a high-dimensional map. This map isn’t random; it’s governed by the principle of relatedness. Concepts that share similar meanings, like “dog” and “cat,” sit closer together in this vector space because both are four-legged, domestic, land-dwelling animals. Conversely, concepts with less semantic connection, like “dog” and “window,” are positioned farther apart. This positioning lets AI models efficiently process information and identify relationships between concepts based on their proximity within the vector space.
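
Here’s a minimal sketch of that idea in Python. The four-dimensional vectors below are made-up toy values (real embeddings have hundreds or thousands of learned dimensions), but the similarity math is the same.

```python
import numpy as np

# Toy 4-dimensional embeddings; real models learn these values from data.
embeddings = {
    "dog":    np.array([0.9, 0.8, 0.1, 0.2]),
    "cat":    np.array([0.85, 0.75, 0.15, 0.25]),
    "window": np.array([0.1, 0.2, 0.9, 0.7]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors: near 1.0 = same direction, near 0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["dog"], embeddings["cat"]))     # high (~0.99)
print(cosine_similarity(embeddings["dog"], embeddings["window"]))  # lower (~0.34)
```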

Vector spaces work both ways. Not only do they help us teach AI models, but they can also help us uncover patterns in the world that humans have yet to recognize. For example, the Semantic Space Theory behind latent spaces is helping us discover new mappings of human emotion, as Hume.ai has shown. We’re also finding new scents through research led by Osmo.ai, which maps the world’s smells and interpolates between them to discover new ones.

As it turns out, AI is better than humans at pattern matching: finding critical patterns in vast volumes of data that we were initially oblivious to, or too biased to acknowledge. Therefore, if AI models genuinely have an “unbiased” view of reality, can they observe reality just as it is? In theory, yes, but this assumes unbiased training data. For the moment, let’s set that obstacle aside.

🪨THE ALLEGORY OF THE CAVE

If that’s the case, could our AI training eventually mature to the point that all foundation models evolve into the same AI model, because there is only one accurate way of interpreting reality as it is? To validate this theory, the representations of these AI models should all converge into a single representation of the world: an objectively true and universal way of encoding human knowledge. Think of representation building as an act of intelligence. The closer my representation of the world is to reality, the more I prove I understand it. As previously explained, an AI model’s representation of the world has, at a minimum, thousands of dimensions, where similar concepts sit closer together and dissimilar concepts are pushed apart.

However, it’s not only the overall distribution of concept representations that matters, but also the distances between them. One AI model’s representation of the color “red” should be similar to that of other AI models in the same modality (i.e., comparing LLMs against LLMs). How a Language Model and a Vision Model interpret and encode “red” should be similar, too. In other words, the distance between “red” and “blue” should be the same across AI models if they all interpret the color “red” as reality presents it.
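
As a rough sketch of what “equal distances across models” could look like in practice, you could compare the pairwise distance geometry of the same concepts under two models. The toy vectors below are invented for illustration; the point is that absolute coordinates can differ while the relative geometry agrees.

```python
import numpy as np

# Toy embeddings of the same concepts from two hypothetical models.
model_a = {"red": np.array([1.0, 0.0]), "blue": np.array([0.0, 1.0]), "grass": np.array([-1.0, 0.2])}
model_b = {"red": np.array([2.1, 0.1]), "blue": np.array([0.1, 2.0]), "grass": np.array([-1.9, 0.5])}

def pairwise_distances(embs):
    words = sorted(embs)
    return np.array([[np.linalg.norm(embs[w1] - embs[w2]) for w2 in words] for w1 in words])

# Normalize each distance matrix so overall scale doesn't matter,
# then check how closely the two geometries agree.
d_a = pairwise_distances(model_a); d_a /= d_a.max()
d_b = pairwise_distances(model_b); d_b /= d_b.max()
agreement = np.corrcoef(d_a.ravel(), d_b.ravel())[0, 1]
print(f"geometry agreement: {agreement:.2f}")  # near 1.0 when representations converge
```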

In this inquiry, AI researchers rely on Plato’s view of reality, illuminated through his landmark work, “The Allegory of the Cave.” Here’s a summary to jog your memory:

The allegory begins with prisoners chained inside a cave. Behind them is a fire, and between the fire and the prisoners are people carrying puppets that cast shadows onto the opposite wall. The prisoners watch these shadows, believing this to be their reality.

One prisoner escapes his chains and discovers the world beyond the cave. However, he’s blinded when he returns to the cave because his eyes are accustomed to sunlight.

The chained prisoners see his blindness and believe they’ll be harmed if they attempt to leave the cave. To them, that truth isn’t worth seeking.

-Plato’s “The Allegory of the Cave.”

In this framing, the data we currently feed AI models are the shadows, a vague projection of reality, and our earlier AI systems are the prisoners, holding only a partial view of the world. But with scale, multitasking ability, and an allegedly more comprehensive dataset about the world, foundation models might transcend their data, eventually emerging from the cave to learn the true nature of reality.

On the other hand, one could argue that the training data itself is the “shadow” because humans are still oblivious to reality’s true nature. That argument would also imply that our current AI training methods, imitation learning on human data, can never reach AGI.

⚙️CONVERGENCE

Suppose AI models can one day observe reality just as it is, independent of modality (e.g., language, image, or video AI models). In that case, they should all share an identical definition of reality. Consider an image encoder (“f_img”) and a text encoder (“f_text”) presented with the same scene: they observe the same underlying world concept, so their representations should be identical. And they likely are. When comparing a set of LLMs against Vision Models, there’s an almost 1:1 correlation between an LLM’s performance and its alignment (i.e., how similar its inner representations are) to the Vision Model. In layman’s terms, the better the LLM performs, the more similar its representations become to those of powerful Vision Models, despite operating in an entirely different modality.
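
One common way researchers quantify this kind of cross-modal alignment is a mutual nearest-neighbor score: embed the same items (e.g., images and their captions) with both models and measure how often the two models agree on which items are neighbors. The sketch below is a simplified version of that idea, not the exact metric from any specific paper.

```python
import numpy as np

def mutual_knn_alignment(feats_a, feats_b, k=10):
    """Average overlap of k-nearest-neighbor sets between two models' features.

    feats_a, feats_b: arrays of shape (n_samples, dim_a) and (n_samples, dim_b),
    where row i in both arrays represents the same underlying input.
    """
    def knn_indices(feats):
        normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        sims = normed @ normed.T          # pairwise cosine similarities
        np.fill_diagonal(sims, -np.inf)   # exclude each item from its own neighbors
        return np.argsort(-sims, axis=1)[:, :k]

    nn_a, nn_b = knn_indices(feats_a), knn_indices(feats_b)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlap))

# Hypothetical usage: embed the same image/caption pairs with both models, then:
# score = mutual_knn_alignment(llm_features, vision_features, k=10)
# Scores closer to 1.0 mean the two models organize the data in the same way.
```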

This similarity is due to the narrowing of the hypothesis space. As an AI model has to find a common way to solve more than one problem, the space of possible solutions to all of its tasks shrinks. Consequently, the larger the AI model and the broader the set of skills it’s trained for, the more these AI models tend to converge, independently of the modality and datasets used, raising the possibility that, one day, all our frontier labs will converge on creating the same AI model.
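
A toy illustration of that narrowing (my own example, not from the research): enumerate every possible boolean function over three inputs as the “hypothesis space,” then count how many functions stay consistent as each task’s examples are added.

```python
import itertools

# Hypothesis space: all boolean functions over 3 input bits (2^8 = 256 candidates).
inputs = list(itertools.product([0, 1], repeat=3))
hypotheses = list(itertools.product([0, 1], repeat=len(inputs)))

def consistent(h, examples):
    return all(h[inputs.index(x)] == y for x, y in examples)

task_a = [((0, 0, 0), 0), ((1, 1, 1), 1)]                  # examples from task A
task_b = [((0, 1, 0), 1), ((1, 0, 1), 0), ((1, 1, 0), 1)]  # examples from task B

fits_a = [h for h in hypotheses if consistent(h, task_a)]
fits_both = [h for h in fits_a if consistent(h, task_b)]
print(len(hypotheses), len(fits_a), len(fits_both))  # 256 -> 64 -> 8
```

Every extra task an AI model must solve removes candidate hypotheses that fit one task but not the others, which is why broader training pushes different models toward the same solution.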

🔑KEY TAKEAWAY

Convergence will commoditize the market for foundation models. As a result, the money will flow to companies building specialized AI models, tools for fine-tuning AI models, and AI-powered applications. Practically speaking, reality-matching AI models probably won’t exist, because models learn from the past and the future is fundamentally distinct from it. Philosophically, however, “omniscient” AI models may be the first step on the long road to AGI. Time will tell.

📒FINAL NOTE

If you found this useful, follow us on Twitter or provide honest feedback below. It helps us improve our content.

How was today’s newsletter?

❤️TAIP Review of the Week

“As an 8th grade English teacher, I really appreciate the links to AI tools that are education-centric.”

-Mr. House (⭐️⭐️⭐️⭐️⭐️Nailed it!)
REFER & EARN

🎉Your Friends Learn, You Earn!

You currently have 0 referrals, only 1 away from receiving ⚙️Ultimate Prompt Engineering Guide.

Refer 5 friends to enter 🎰July’s $200 Gift Card Giveaway.
