The AI Pulse
🧠 ChatGPT’s Creativity Is Overrated
PLUS: 3 Obstacles Preventing Chatbots From Matching Human Creativity

Welcome back, AI prodigies!
In today’s Sunday Special:
🎨Is Generative AI Creative?
🦾GPT-4 vs. College Students
Three Obstacles
Key Takeaway
Read Time: 6 minutes
Key Terms
Torrance Tests of Creative Thinking: a creativity assessment that tests convergent and divergent thinking through various verbal and nonverbal tasks.
Retrieval-Augmented Generation (RAG): a framework designed to make language models more reliable and accurate by pulling relevant, up-to-date data directly related to a user’s query from a source (e.g., a scientific journal or news article).
🎨IS GENERATIVE AI CREATIVE?
Some researchers think so. Last year, a professor pitted GPT-4 against college students in the Torrance Tests of Creative Thinking, the most widely used creativity assessment. Before we share the results and their potential implications, let’s define creativity: it requires both novelty and utility. A creative idea combines existing things in a new and helpful way or produces entirely new things that serve a purpose. But there’s something abstract, perhaps even magical, about how we create novel ideas. We’ve all experienced the “Aha!” moment, but discerning where it came from and how to replicate that process is nearly impossible.
🦾GPT-4 VS. COLLEGE STUDENTS
The Torrance Tests contain three sections, each with several tasks. Each task has a time limit based on age, test objective, and other factors.
Verbal Tasks Using Verbal Stimuli:
Impossibilities: List as many impossibilities as possible.
Just Suppose: Confronted with an unlikely scenario, subjects must predict potential outcomes. New variables will be introduced throughout the exercise to influence their predictions.
Verbal Tasks Using Non-Verbal Stimuli:
Ask and Guess: Ask non-obvious questions about a picture. Hypothesize the causes and effects of the scenario in the picture.
Unusual Uses: Think of the most clever, engaging, and uncommon uses of a toy or any object.
Non-Verbal Tasks (Excluded From the GPT-4 vs. College Students Duel):
Circles and Squares: On a page with 42 circles of equal size, sketch objects or pictures that use circles. Repeat for squares.
Incomplete Figures: A page contains ten squares, each containing a different stimulus drawing. Sketch objects or designs by adding as many lines as possible to the ten figures.
Results are scored on four categories: fluency, flexibility, originality, and elaboration. Fluency is the total number of interpretable, meaningful, and relevant ideas generated, and flexibility is the number of different categories those responses fall into. Originality reflects how uncommon the ideas are, and elaboration reflects how much detail they include. How do you think GPT-4 fared against college students?
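To make fluency and flexibility concrete, here is a toy sketch in Python. The responses and category labels are invented for illustration (hypothetical answers to an Unusual Uses task about a brick). The sketch also assumes a human rater has already judged each idea relevant and assigned its category, so it is not the official Torrance scoring procedure.

```python
def fluency(rated_ideas):
    """Count the ideas a rater accepted as relevant."""
    return len(rated_ideas)


def flexibility(rated_ideas):
    """Count the distinct categories those ideas fall into."""
    return len({category for _, category in rated_ideas})


# Hypothetical (idea, category) pairs, labeled by an imagined rater.
rated_ideas = [
    ("doorstop", "household"),
    ("paperweight", "household"),
    ("bookend", "household"),
    ("garden border", "outdoors"),
    ("exercise weight", "fitness"),
]

print("fluency:", fluency(rated_ideas))          # prints "fluency: 5"
print("flexibility:", flexibility(rated_ideas))  # prints "flexibility: 3"
```

Five ideas in only three categories shows why the two scores differ: a long list that stays in one category is fluent but not flexible.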
🩺 PULSE CHECK
What percent of college students did GPT-4 beat in creativity?
THREE OBSTACLES
If you’ve tinkered with OpenAI’s ChatGPT, this shouldn’t be too surprising. The AI model read the web, remembered what it read, and now generates the most likely word to follow the words before it. Despite this impressive performance, chatbots have severe limitations that keep them from replacing humans in creative work.
They Can’t Apply Creative Output: Let’s say you ask OpenAI’s ChatGPT to develop 20 names for a clothing business. You still have to check whether they’re taken and culturally appropriate. Also, the suggestions don’t reflect your personal experiences. Chatbot output is, at best, a starting point for more complex questions, like vacation planning. When I asked for a four-week European itinerary, Google’s Gemini failed to include the links to accommodations, transportation, and restaurants that I specifically asked for. It can describe destinations in flowery language, but it can’t help book anything.
They Sacrifice Utility for Accuracy: Often, the most accurate response won’t be actionable for the user. When asked to list former President Trump’s indictments, Google’s Gemini failed to answer, directing users to Google Search. OpenAI’s ChatGPT, on the other hand, listed some of them but left others out. Although accuracy is essential, no one wants to use a product that doesn’t give them the information they want.
They Hallucinate: Although OpenAI’s ChatGPT has a reported hallucination rate of 3%, which beats its competitors, hallucinations become much more frequent as queries get more complex. Double-checking outputs via Google Search or external sources is necessary for high-stakes endeavors like school or work. Developers address this through an AI framework called Retrieval-Augmented Generation (RAG). Instead of relying only on vast training data to generate a response, RAG-enabled chatbots pull information from smaller, high-quality datasets, like Wikipedia, published research papers, or legal documents. Although incorporating RAG into your chatbot requires technical know-how, a manual version is also possible. Paste the text you want it to reference into the prompt and ask the chatbot to rely on it. The catch is that your high-quality dataset can only be as large as the prompt allows, but responses should be more accurate.
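For readers who want to see the manual approach spelled out, here is a minimal sketch in Python. It assumes a crude word-overlap ranking in place of the search a real RAG system would use, and the function names (`tokenize`, `retrieve`, `build_prompt`) and prompt wording are illustrative choices, not any chatbot’s API. It only builds the prompt text; you would still paste the result into the chatbot yourself.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=1):
    """Rank passages by shared words with the question (a crude stand-in for real retrieval)."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages, top_k=1):
    """Paste the best-matching text into the prompt and ask the chatbot to rely on it."""
    context = "\n\n".join(retrieve(question, passages, top_k))
    return (
        "Answer the question using only the reference text below. "
        "If the answer is not in the text, say you don't know.\n\n"
        f"Reference text:\n{context}\n\n"
        f"Question: {question}"
    )


# Illustrative reference passages; in practice, use your own trusted sources.
passages = [
    "The Torrance Tests score responses on fluency, flexibility, originality, and elaboration.",
    "Retrieval-Augmented Generation pulls relevant text from a trusted source before answering.",
]

prompt = build_prompt("How are the Torrance Tests scored?", passages)
print(prompt)
```

Telling the chatbot to admit when the answer isn’t in the reference text is a design choice meant to discourage guessing; it reduces hallucinations rather than ruling them out.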
KEY TAKEAWAY
We’re still in the earliest innings of conversational AI. Chatbots can only reason by analogy: repeating or rewording past writing. They can’t generate novel, useful, and feasible ideas, never mind accurate ones. At least not yet. Experts disagree on whether the hallucination problem is solvable. But even with 100% accuracy, chatbots’ creative outputs have severe limitations. In narrow, structured assessments, they’re quick brainstormers. But most problems require a combination of unprogrammable skills and knowledge, for now.