Science Says
Rethinking Creativity in the Age of Intelligent Machines: The Human-AI Collaboration
Let’s address the elephant in the room: what CAN’T AI do? With the rapid pace of developments in computational power, natural language processing, and machine learning algorithms, it’s not hard to imagine artificial intelligence picking up any human skill and matching us at it, or even beating us.
Until recently, we might have believed there was no substitute for us in the creative domain. But in the last couple of years, Generative Artificial Intelligence (GAI) tools have been writing text and producing images, music, and videos for the general public. And they only seem to be getting better.
So has AI matched or even surpassed human creativity? This study investigates the question by pitting human participants against six GAI chatbots in a standard test of creativity called the Alternative Uses Test. The insights from this research may influence how you think about incorporating AI into how you work in and on your business.
Creativity is about making something ‘new and useful’
In psychology and neuroscience, creativity isn’t just about artistic expression. One widely accepted definition of creativity is “the interaction of aptitude, process, and environment by which an individual or group produces a perceptible product that is both novel and useful as defined within a social context.”
Still subjective, yes, but notice that this definition says nothing about inspiration, emotion, or beauty; it rests on the mundane criteria of making something new and useful.
The study differentiates between everyday creativity, meaning the fast-paced improvisation, problem-solving, and imagining built into everyday work and living, and what the authors call eminent creativity, the kind that significantly impacts industries or fields of expertise.
For example, when you think of an analogy to help a client understand their problem and why their current situation isn’t working for them, you exercise everyday creativity. Or when you figure out how to reshuffle a few professional appointments to make room for a personal emergency, there is some everyday problem-solving creativity there as well.
Contrast this with how Jeff Bezos transformed retail with Amazon, or how Steve Jobs revolutionized smartphones. This kind of far-reaching creativity is rare, something the vast majority of us may never achieve.
This study focuses on everyday creativity as measured by the Alternative Uses Test (AUT). The AUT is a classic creativity test used to measure divergent thinking. Participants are given a common object like a brick, paperclip, or tin can. They are then asked to generate as many alternative, unusual uses for that object as they can within a set time limit, usually 2-3 minutes. This test aims to capture the spontaneous generation of alternative ideas for using a mundane object in novel ways.
Action Item: The next time you think of yourself as ‘not creative,’ remember this! Don’t discount your daily problem-solving, innovating, and finding new ways of thinking about or doing things. In fact, encourage yourself to practice this creativity daily, even with the help of AI chatbots.
Promising results for GAI chatbots’ originality and quantity of ideas
So how did the humans perform on the Alternative Uses Test versus the GAI chatbots? The experiment was designed as follows: 100 humans were given three minutes to come up with as many creative uses as possible for five common objects – a ball, fork, pants, tire, and toothbrush.
Five GAI chatbots – Alpa.ai, Copy.ai, ChatGPT (version 3), Studio, and YouChat – took the same test, receiving the prompt “What can you do with [object]?” for each of the five objects. (A sixth, ChatGPT version 4, joined the pool later; more on that below.) Because the chatbots initially churned out only a limited number of answers, the researchers added up to three follow-up prompts such as “what else” or “more of this.”
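For the technically curious, here is roughly what that prompting protocol could look like if you wanted to replicate it with a modern chatbot API. This is a minimal sketch, not the researchers’ actual setup: the OpenAI Python SDK, the model name, and the exact follow-up wording are all stand-ins.

```python
# Sketch of the study's prompting protocol: one opening question per object,
# followed by up to three "what else?" prompts in the same conversation.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

OBJECTS = ["ball", "fork", "pants", "tire", "toothbrush"]
FOLLOW_UPS = ["What else?"] * 3  # the study used up to three follow-ups

def run_aut(obj: str) -> list[str]:
    """Collect the chatbot's answers for one object, follow-ups included."""
    messages = [{"role": "user", "content": f"What can you do with a {obj}?"}]
    replies = []
    for turn in range(1 + len(FOLLOW_UPS)):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in model, not one from the study
            messages=messages,
        )
        answer = response.choices[0].message.content
        replies.append(answer)
        messages.append({"role": "assistant", "content": answer})
        if turn < len(FOLLOW_UPS):
            messages.append({"role": "user", "content": FOLLOW_UPS[turn]})
    return replies

for obj in OBJECTS:
    print(obj, run_aut(obj))
```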
Then a panel of six human raters reviewed all the responses and rated them for originality, without knowing whether the ideas came from human participants or AI chatbots. A specially trained AI model also rated the answers. The researchers checked that the human and AI raters were aligned and consistent in how they scored the various answers for originality.
And the results? On average, there were no significant differences between the originality scores of human participants and GAI chatbots. And the GAI chatbots produced significantly more ideas – two to three times as many – after additional prompting to expand their responses.
So overall, both the human and the AI raters found the originality of human participants to be on par with that of the GAI chatbots. But the chatbots pumped out significantly more answers for the same exercise with just a few additional prompts.
Action Item: When there is a need for a high volume of input or ideas, AI assistants may be your best bet. You can ask a GAI chatbot like ChatGPT for 20 catchy titles for a blog post, 15 options for an article outline, 30 new slogans, or five workshop ideas complete with interactive exercises and discussion prompts, and you will get plenty of suggestions within a few seconds.
You can have a brainstorming partner that can give you a lot of input in a short span of time, which can be a huge boost to your productivity compared to doing it alone. The key here is to take the seeds of unique and helpful ideas and develop them further into something that suits your needs.
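If you like to script this kind of high-volume brainstorming, a tiny helper along these lines will do it. Again, this is only a sketch: the `brainstorm` function, the model name, and the list parsing are illustrative choices, and any chatbot SDK would work just as well.

```python
# A small, hypothetical brainstorming helper: ask for N ideas as a numbered
# list, then split the reply into individual strings you can pick through.
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def brainstorm(ask: str, n: int = 20) -> list[str]:
    """Request `n` ideas and return them as a list of strings."""
    prompt = f"Give me exactly {n} {ask}. Return a numbered list, one idea per line."
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    # Keep only lines that look like "1. idea" or "1) idea".
    return re.findall(r"^\s*\d+[.)]\s*(.+)$", text, flags=re.MULTILINE)

titles = brainstorm("catchy titles for a blog post about everyday creativity")
slogans = brainstorm("slogans for a productivity coaching service", n=30)
```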
Some humans still outperformed chatbots in originality, but for how long?
Interestingly, when looking at individual originality scores (not averaged across all human participants), researchers observed that about 32% of human participants showed more originality than any of the GAI chatbots. The most creative humans in this experiment still exceeded GAI originality.
Understanding how GAI chatbots are trained helps explain why this may be so. The underlying technology is trained on data we humans have produced, and the output is checked by humans as well. So essentially, GAI learns creativity from us, pairing that human-generated material with mathematical models designed to churn out unexpected and novel output.
But as AI training data expand further and the underlying models become more sophisticated, can chatbots catch up to the most creative humans in terms of originality of ideas? We got a sneak peek of this when the researchers added ChatGPT 4 to their evaluation pool (it was released while the study was still in progress).
When they ran GPT4 through the Alternative Uses Test and scored its responses, the share of human participants who scored higher in originality than every GAI chatbot dropped from 32% to 9%. In fact, another creativity experiment comparing GPT4 with humans showed this GAI chatbot outperforming the human participants on average in originality, flexibility, and quantity of ideas generated. Even on an individual basis, not a single human participant outscored GPT4 on originality.
Action Item: As new versions trained on ever-growing databases get released in quick succession, with major improvements each time, it will be important for us to stay up to date on the latest advancements. The brainstorming process can become more interactive and collaborative as the quality of GAI ideas improves.
The study notes, “Research has shown that exposure to other people’s creative ideas can stimulate cognitive activity and enhance creativity.” GAI chatbots can stand in as a source of ‘other people’s creative ideas,’ stimulating even more novel thinking from us.
So the freelance writer who gets 15 suggestions for an article outline can combine the most interesting AI-generated ideas, add their own creative spin and personal examples, then flesh them out into full drafts. The marketing consultant who gets 30 suggestions for slogans can cherry-pick the most compelling AI-generated ideas to strengthen their pitch deck and messaging. Humans can combine the most promising AI ideas, adding context, expertise, experience, and unique perspectives to come up with final solutions that may not have been obvious from the initial AI suggestions alone.
Chatbots can’t be creative on their own
The notion of human-AI collaboration brings us to a critical distinction: creativity isn’t just idea generation. It is a process that begins with identifying a problem, then moves through ideation, assessing which ideas fit, and finally implementation. While the study results suggest that GAI chatbots can roughly match us in originality and even surpass us in the volume of ideas generated, this addresses only one part of the whole creative process.
The motivation to start a creative process needs to come from humans. As of now, GAI won’t autonomously identify problems in your processes and start thinking about ways to get around them. Nor will it, on its own, think about how you can inject some fun into your workday. The trigger for everyday creativity still starts with us, as we encounter the need to improve or resolve our day-to-day issues and experiences.
And then we need to implement, test, and see whether the idea actually works. Does the new process you came up with for making social media content save you time and help you make better posts? Did switching your morning strength training for twice-weekly swims give you that extra kick you need to start your day? To a certain extent, some AI can test solutions (as in software testing), but again, this process needs to be supervised by the human in charge.
Action Item: As of today, humans still control and drive the creative process. That is why prompt engineering has become a valuable skill. To get the best responses, we need to clearly frame the problem, provide context, use specific language, and offer examples and feedback. In fact, one limitation of the experiment was that the researchers didn’t craft more effective prompts, which could potentially have allowed the chatbots to produce even more novel ideas!
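To make that concrete, here is one hypothetical way a prompt might bundle those elements together. The coaching scenario, the constraints, and the example idea are all invented for illustration:

```
You are helping a solo nutrition coach (context).
Problem: clients keep abandoning their food journals after week two (clear framing).
Give me 10 ideas for making journaling feel effortless, each under 15 words,
ranked from easiest to hardest to implement (specific language).
An example of the kind of idea I like: "photo-only logging, no writing" (example).
Avoid anything that requires clients to buy a new app subscription (constraint).
```

Each element gives the chatbot less to guess at, and you can keep refining the prompt with feedback on the answers you get back.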
As more AI tools and capabilities come to market, the way we interact and collaborate across the whole creative process will likely evolve as well. For example, ChatGPT4o came out just about six months after ChatGPT4 Turbo, and this version allows audio input in addition to text and images, with faster response times and an expanded ability to maintain context over longer conversations. Basically, this makes interaction with the GAI more intuitive and natural. And as of this writing, it’s free to use!
So staying on top of advancements in AI and continuously refining our prompting techniques will be crucial to maximizing the assistance these tools can provide, not just in ideation but in implementing and testing ideas as well.
Limitations
While the study provides interesting insights into the current state of AI creativity on one measure compared to humans, the limitations in sample size and diversity, task scope, methodological choices, and the evolving nature of AI mean the findings cannot be taken as definitive statements about human vs AI creativity in general.
This study included only 100 human participants, a relatively small sample. No demographic information was provided about the participants beyond their being native English speakers from the US in full- or part-time work. Factors like age, gender, education level, and occupation were not specified, so it’s unclear how representative the sample is of the broader population.
The study only looked at performance on one type of everyday creative task – the Alternative Uses Test (AUT). While the AUT is a widely used measure of divergent thinking, creativity is a broad and multifaceted construct. Performance on this one task may not fully capture the nuances of human vs AI creativity in other domains or types of creative work.
Moreover, the originality rating is a highly subjective measure. Cultural beliefs about creativity may influence what is perceived as novel and creative.
Lastly, to allow direct comparison with the human responses, the researchers did not use any specialized prompting techniques to elicit the AI responses. However, carefully crafted prompts and follow-ups can significantly improve the quality and relevance of AI-generated creative ideas.
In a similar vein, human participants were paid to take part in the experiment, which may not be the best motivation for coming up with novel and original ideas, compared to when we are deeply interested in a certain problem or have expertise in a particular sphere. So the study likely underestimated the full creative potential of both the AI chatbots and the humans.
Wrap Up
The study showed us how we stack up against GAI chatbots in one aspect of creativity, and AI is doing pretty well. The speed at which ChatGPT4 improved on ChatGPT3 in terms of novelty of ideas only highlights the vast potential of these applications.
Probably the most relevant question to us now is, how do we incorporate GAI tools in our productivity stack? How can we maximize their capabilities not just in ideation but even in testing and implementation? And how do we make sure that our use of AI is always safe and ethical?
These questions go beyond human-AI collaboration in the creative sphere, but they are definitely something to guide our philosophy and approach to how we use these amazing tools.