🧠 Generative AI in the Enterprise: Eclectic and Essential

PLUS: How Databricks Addressed Financial and Environmental Concerns About AI

Welcome back, AI prodigies!

In today’s Sunday Special:

  • 🏗Framework for Enterprise Adoption

  • 🥾Generative AI on the Ground

  • 😶‍🌫️Demystifying Data for AI

  • 🙋Which LLM Is Right for Me?

  • 🤝Dolly 2.0: Eliminating Tradeoffs

Read Time: 7 minutes

🎓Key Terms

  • Large Language Models (LLMs): AI models pre-trained on vast amounts of data to generate and classify text, conversationally answer questions, and translate languages.

  • Parameter-Performance Tradeoff: as the number of parameters in a model increases, the performance gain from each additional parameter decreases while training and inference costs keep climbing.

  • Open Source: describes an AI model whose underlying code and model weights are openly available for public use.

🏗FRAMEWORK FOR ENTERPRISE ADOPTION

Every business we buy from will soon have an AI strategy, as will nearly every organization we work for, every charity we donate to, and every government agency we interact with. When leaders consider deploying new technology across an organization, they examine the following:

  1. Value Proposition: What increases in profitability or cost efficiency are likely, whether from improved operational efficiency, employee productivity, data-driven insights, or another source?

  2. Build vs. Buy: How do the financial, strategic, and opportunity costs of internally building and deploying new technology compare to the external alternative?

  3. Governance: Does this technology meet or exceed proprietary and regulatory standards of safety, security, privacy, reliability, scalability, and environmental friendliness? If not, does its value proposition justify the risk?

🥾GENERATIVE AI ON THE GROUND

The value proposition of enterprise-wide AI, or integrating AI into at least five core business functions, is promising. Generative AI pledges to boost employee productivity, generate data-driven insights, radically personalize the customer experience, and detect security threats.

In the context of slow-moving corporations, ubiquitous AI adoption happened overnight. In 2022, MIT Technology Review surveyed 600 senior data and technology executives worldwide, and just 8% said AI was a critical part of 3 or more business functions. Though MIT didn’t replicate the survey in the post-ChatGPT era, Ernst & Young found, nearly a year after ChatGPT’s launch, that 42% of executives expect generative AI to have a “significant impact on how internal functions operate” within 1-2 years, and an additional 37% agreed when extending the time horizon to 3-5 years.

The democratization of Large Language Models (LLMs) is almost certainly behind this shift, as they offer use cases across every function and industry: Boston Consulting Group candidates conduct case interviews with chatbots; DuPont, a chemicals pioneer, interrogates internal documents in fragmented IT architectures from recently acquired organizations; Wells Fargo specifies what information clients must provide to regulators; OpenAI auto-fills code via GitHub Copilot; Coca-Cola optimizes manufacturing processes with the world’s first industrial LLM; and Eversana, a commercial services provider for life sciences, supercharges marketing content creation via Adobe Firefly.

Explore this article for five more mind-blowing use cases, including 10,000 product ideas from an LLM that generated $100 million for Unilever.

😶‍🌫️DEMYSTIFYING DATA FOR AI

Unlike executives’ sentiments about AI, these LLMs didn’t arrive overnight. Creating and maintaining a robust data infrastructure was the first step. LLMs need data as much as vehicles need gasoline, and collecting, refining, and distributing data in various quantities and qualities is no easy feat. To accommodate the needs of AI models, data infrastructure underwent the following evolution:

  • Data Warehouse: Data warehouses are like Wikipedia. They’re mostly reliable, fail to update in real-time, and process data in batches. They also can’t accommodate emerging data formats like text and images, much as Wikipedia remains overwhelmingly text-based.

  • Data Lake: Data lakes are like a hoarder’s closet. They accommodate a wide range of data types but are challenging to construct. Lake data is also complex to explore because it has not been validated or cleaned up.

  • Data Lakehouse: Data lakehouses bring the best of both worlds. They leverage the flexibility and scale of data lakes with the governance and quality of data warehouses.

When training LLMs, technologists prefer data lakehouses because they minimize the need to move data across silos, reducing the risk that sensitive data, such as financial records, is exposed to unauthorized employees. Once the data infrastructure is in order, it’s time to train.
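To make the lakehouse idea concrete, here is a minimal sketch of reading a governed table of LLM training data with PySpark and Delta Lake (the open-source storage layer behind Databricks’ lakehouse). The table path and column names are hypothetical, and the configuration assumes the delta-spark package is installed.

```python
# Minimal sketch: querying a lakehouse (Delta Lake) table with PySpark.
# Assumes Apache Spark with the delta-spark package; the table path and
# column names below are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("lakehouse-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)

# Read a schema-enforced, ACID-compliant Delta table in place -- no copying
# data into a separate silo before it can be used for training.
tickets = spark.read.format("delta").load("/lakehouse/support_tickets")

# Keep only scrubbed, English-language text destined for fine-tuning.
training_rows = (
    tickets.where("language = 'en' AND pii_scrubbed = true")
    .select("ticket_id", "body_text")
)
training_rows.show(5)
```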

🙋WHICH LLM IS RIGHT FOR ME?

Organizations must make tradeoffs across personalization, security, and cost when selecting an LLM type:

  • Closed-Source LLMs: poor personalization, poor security, and low cost.

  • Domain-Specific LLMs: moderate personalization, poor security, and moderate cost.

  • Custom LLMs: high personalization, high security, and high cost.

For many, security is a top concern. Financial services firms fear leaking Material Non-Public Information (MNPI) about assets they invest in or deals in progress. Intellectual Property (IP) protection is also a priority. Social media giants must protect the blueprints for their recommendation algorithms, and pharmaceutical players keep the formulas for their patented drugs secret. Custom LLMs are integral to maintaining competitive advantage but are far from no-brainers.

Financial and environmental costs plague training efforts and model maintenance. OpenAI reportedly spent $40 million monthly to process queries following ChatGPT’s launch. According to University of Washington research, training ChatGPT consumed as much electricity as 1,000 U.S. households use in a year, and a day of user queries consumes roughly as much energy as 33,000 households. Though this is a tiny share of total energy consumption, widespread adoption of compute-intensive AI models is inevitable, so the footprint will only grow.

Yet corporations and governments have pledged carbon neutrality. President Biden signed an executive order setting a 2050 net-zero deadline, and according to the Net Zero Tracker, companies representing two-thirds of the annual revenue of the world’s 2,000 largest firms have net-zero targets. However, two emerging trends mitigate these financial and environmental concerns:

  • Smaller models: Many organizations favor cheaper, smaller LLMs due to the parameter-performance tradeoff. Michael Carbin, MIT Professor and Founding Advisor at MosaicML, explains: “I believe we’re going to move away from half a trillion parameters in a model to 7, 10, 30, 50 billion parameters on the data [firms] have.”

  • Cheaper energy: Many also expect exponential decreases in solar energy costs, both in theory and in practice. According to Wright’s Law, which has been found to describe at least 60 technologies, each doubling of cumulative production leads to a fixed percentage decline in cost. Across the globe, solar costs dropped by a factor of 5 from 2010 to 2020, potentially reducing the cost of computing power over time.
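As a back-of-the-envelope illustration of Wright’s Law (the learning rate and starting cost below are illustrative assumptions, not figures from any cited source):

```python
# Toy illustration of Wright's Law: unit cost falls by a fixed fraction
# (the "learning rate") with every doubling of cumulative production.
# The 20% learning rate and $1.00 starting cost are illustrative assumptions.
def wrights_law_cost(initial_cost: float, learning_rate: float, doublings: int) -> float:
    """Unit cost after a given number of doublings of cumulative production."""
    return initial_cost * (1 - learning_rate) ** doublings

for d in range(8):
    print(f"{d} doublings -> unit cost ${wrights_law_cost(1.00, 0.20, d):.2f}")

# With a 20% learning rate, roughly 7 doublings cut cost by about a factor
# of 5 -- the same order of magnitude as the 2010-2020 drop in solar costs.
```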

🤝DOLLY 2.0: ELIMINATING TRADEOFFS

After homing in on AI’s value proposition and addressing governance risks, the tradeoffs between the prevailing LLM choices seem harsh. That’s why data intelligence leader Databricks pioneered Dolly, the world’s first open, instruction-tuned LLM. With just 6 billion parameters (versus GPT-3’s 175 billion), Dolly is customizable, open-source, low-cost, and secure. Its successor, Dolly 2.0, is the first version licensed for commercial use. Companies can fine-tune it on proprietary datasets to build personalized applications they fully own. Given the governance considerations of large enterprises, open-source, customizable LLMs are the future for businesses in every industry and lifecycle stage.
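For readers who want to experiment, Databricks published Dolly 2.0’s weights on Hugging Face (databricks/dolly-v2-3b, -7b, and -12b). The snippet below is a minimal sketch using the transformers library; it assumes transformers, torch, and accelerate are installed and that the machine has enough memory for the 3-billion-parameter checkpoint.

```python
# Minimal sketch: prompting Dolly 2.0 through Hugging Face transformers.
# Assumes `pip install transformers torch accelerate` and enough memory for
# the 3B-parameter checkpoint; the 7B/12B variants swap in the same way.
import torch
from transformers import pipeline

generate = pipeline(
    model="databricks/dolly-v2-3b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # loads Databricks' instruction-following pipeline
    device_map="auto",
)

print(generate("Summarize the tradeoffs between closed-source and custom LLMs."))
```

Because the weights and the underlying instruction dataset are openly licensed, the same checkpoint can be fine-tuned on proprietary data without sending that data to a third-party API.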

📒FINAL NOTE

If you found this useful, follow us on Twitter or provide honest feedback below. It helps us improve our content.

How was today’s newsletter?

❤️AI Pulse Review of The Week

“I like the formatting of the content.”

-Dustin (⭐️⭐️⭐️⭐️⭐️Nailed it!)

🎁NOTION TEMPLATES

🚨Subscribe to our newsletter for free and receive these powerful Notion templates:

  • ⚙️150 ChatGPT prompts for Copywriting

  • ⚙️325 ChatGPT prompts for Email Marketing

  • 📆Simple Project Management Board

  • ⏱Time Tracker
