The AI Pulse

🤖 Harvard Makes 1 Million Books Available to Train AI Models

PLUS: Sixth Day of “12 Days of OpenAI” Event, MIT’s “ContextCite” Verifies AI-Generated Content

Welcome back, AI enthusiasts!

In today’s Daily Report:

  • 📖Harvard Makes 1 Million Books Available to Train AI Models

  • 🎄Sixth Day of “12 Days of OpenAI” Event

  • 🎓MIT’s “ContextCite” Verifies AI-Generated Content

  • 🛠Trending Tools

  • 💰Funding Frontlines

  • 💼Who’s Hiring?

Read Time: 3 minutes

🗞RECENT NEWS

AI TRAINING

📖Harvard Makes 1 Million Books Available to Train AI Models

Image Source: Canva’s AI Image Generators/Magic Media

Collecting high-quality datasets to train AI models is expensive; only Tech Giants like Apple, Salesforce, and Microsoft can afford it. But what if everyone could afford it?

Key Details:
  • Harvard University (“Harvard”) is releasing a massive, high-quality dataset of over a million public-domain books to “level the playing field.”

  • It contains a wide range of authors, genres, and languages. For instance, it includes Charles Dickens, Dante Alighieri, and William Shakespeare. Works by these authors are no longer protected by copyright because of their age.

  • This project is part of the Institutional Data Initiative (IDI), which aims to work with Knowledge Institutions, such as libraries, museums, and cultural groups, to turn their collections into data for training AI models.

Why It’s Important:
  • Reddit signed a licensing agreement with Google worth $60 million annually to provide user-generated content from Subreddits to train Gemini.

  • Small AI firms don’t just have $60 million lying around. So, having access to a massive, high-quality dataset of over a million books enables them to train their AI models without breaking the bank.

🩺 PULSE CHECK

Does this project help level the playing field between smaller AI firms and Tech Giants?


OPENAI

🎄Sixth Day of “12 Days of OpenAI” Event

Image Source: OpenAI/YouTube/“Santa Mode & Video in Advanced Voice-12 Days of OpenAI: Day 6”/Screenshot

OpenAI showcased new vision capabilities for ChatGPT’s Advanced Voice Mode during the sixth day of the “12 Days of OpenAI” event.

Key Details:
  • Advanced Voice Mode is a feature that lets you hold spoken conversations with ChatGPT.

  • Until now, you could only interact with it through your voice. Now, you can also interact with it through live video.

  • In other words, it can access your smartphone’s camera to chat about what you see.

  • Imagine trying to learn more about a painting at an art gallery. Advanced Voice Mode can access your smartphone’s camera to analyze the painting’s style and subject matter and discuss the artist’s vision with you.

  • A new Screen Share Function also enables Advanced Voice Mode to see what’s on your smartphone’s screen to help you respond to texts or emails.

  • Advanced Voice Mode also has a new “Santa” voice option that discusses life at the North Pole or tells stories about saving Christmas.

Why It’s Important:
  • For individuals with visual impairments, the ability to interact with ChatGPT through both voice and video can be transformative.

  • Advanced Voice Mode can help you prepare for interviews, be a late-night study buddy, or be a best friend who gives you opinions on decisions.

AI RESEARCH

🎓MIT’s “ContextCite” Verifies AI-Generated Content

Image Source: MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)/“ContextCite: Attributing Model Generation to Context”/Screenshot

Imagine using a conversational chatbot like OpenAI’s ChatGPT to prepare for a final exam. To do this, you might provide the relevant textbook and study guides as context, allowing ChatGPT to draw on this context to answer your questions. For example, “Can you outline the key takeaways of Opportunity Cost in Chapter One?”

However, after seeing a generated response, you might ask yourself, “Is everything accurate? Did ChatGPT misinterpret any of the context? Is the generated response actually grounded in the context?” Manually answering these questions would be time-consuming and counterproductive. You’d need to read Chapter One yourself and check ChatGPT’s generated response against it.

Researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) created “ContextCite,” a tool that traces a generated response back to the parts of the context it relied on. In other words, this tool shows you exactly which parts of Chapter One ChatGPT used to outline the key takeaways of Opportunity Cost. More importantly, it helps you tell if the generated response is inaccurate or isn’t grounded in the context.
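
For the curious, here is one way to picture the general idea of context attribution: remove one piece of the context at a time and see how much support for the response disappears. This is only a toy sketch in Python, not ContextCite's actual implementation or API. The `support_score` function is a made-up stand-in based on simple word overlap (a real tool would query the language model itself), and the sample sentences are invented for illustration.

```python
import re


def words(text):
    """Return the set of lowercase words in `text`."""
    return set(re.findall(r"[a-z']+", text.lower()))


def support_score(context_sentences, response):
    """Hypothetical stand-in for how well the context supports the response.

    Assumption: we use the fraction of response words found in the context.
    A real attribution tool would measure the model's own behavior instead.
    """
    response_words = words(response)
    if not response_words:
        return 0.0
    context_words = set()
    for sentence in context_sentences:
        context_words |= words(sentence)
    return len(response_words & context_words) / len(response_words)


def leave_one_out_attribution(context_sentences, response):
    """Score each sentence by how much support drops when it is removed."""
    full = support_score(context_sentences, response)
    scores = []
    for i in range(len(context_sentences)):
        ablated = context_sentences[:i] + context_sentences[i + 1:]
        scores.append(full - support_score(ablated, response))
    return scores


# Invented example context and response.
context = [
    "Opportunity cost is the value of the next best alternative you give up.",
    "The chapter opens with a short history of economics.",
    "Scarcity forces people to make choices.",
]
response = "Opportunity cost is the value of the next best alternative."

scores = leave_one_out_attribution(context, response)
for sentence, score in sorted(zip(context, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {sentence}")
```

The sentence with the highest score is the one the response leans on most. If no sentence scores above zero, that is a hint the response may not be grounded in the context at all.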

🛠TRENDING TOOLS

🗣️Shortcut works at the speed of your voice.

🧊AISmartCube builds AI tools in hours with no code.

⚙️SmythOS builds, debugs, and deploys AI agents in minutes.

📊Bricks is an AI-powered spreadsheet that generates formulas.

💬Remention places your product in billions of online discussions with AI.

🔮Browse our always Up-To-Date AI Tools Database.

💰FUNDING FRONTLINES

  • Neubird lands a $22.5M Funding Round to build the world’s first AI-powered ITOps engineer.

  • AyarLabs secures a $155M Funding Round to address the need for a cost-effective AI infrastructure.

  • Prezent.AI raises a $7.3M Funding Round to improve business communications with AI-powered storytelling solutions.

💼WHO’S HIRING?

  • Intel (Santa Clara, CA): Software Engineer Intern, Summer 2025

  • Formlabs (Somerville, MA): Embedded Software Intern, Summer 2025

  • Riot Games (Remote): Research Scientist Intern, Generative AI (GenAI), Summer 2025

  • NVIDIA (Santa Clara, CA): Enterprise Marketing Campaigns Intern, Summer 2025

  • Docusign (San Francisco, CA): Platform Software Engineer Intern, Summer 2025

🤖PROMPT OF THE DAY

CUSTOMERS

🛒Customer Satisfaction Survey

Act as a market research specialist and create a customer satisfaction survey for [Small Business] with [Product or Service] and [Target Audience] that collects actionable insights through quantitative and qualitative questions as well as open-ended feedback.

Small Business = [Insert Here]

Product or Service = [Insert Here]

Target Audience = [Insert Here]

📒FINAL NOTE

ā¤ļøTAIP Review of The Day

ā€œRead these everyday. Great way to stay updated on the ever evolving tech/ai world :)ā€

-James (1ļøāƒ£ šŸ‘Nailed it!)