5-Min Brief: A Study of 100,000 People Found AI Is Now More Creative Than the Average Human. Here's the Catch.

The largest creativity study to date found that AI outperforms the average human on creativity tests. But the top 10% of humans still win, and the full picture is more interesting than the headline.
What you need to know — in 30 seconds

  • Researchers at the Université de Montréal published the largest ever comparative study of human and AI creativity — 100,000 human participants tested against nine AI models
  • The result: AI now outperforms the average human on standardized creativity tests
  • The catch: the top 10% of human creators still leave AI well behind — especially in poetry, storytelling, and richer creative work
  • The study was co-authored by Yoshua Bengio, one of the three researchers who shared the 2018 Turing Award for foundational work on deep learning

Creativity is the thing people most reliably point to when asked what AI can't do. Not facts, not logic, not code — those are learnable. But genuine creative originality? The ability to make unexpected connections, to produce something that surprises even the person who made it? That's supposed to be ours.

A study published in January in Scientific Reports — one that's been getting renewed attention this week — puts a dent in that assumption. Not a fatal one. But a real one.

What the study actually did

The research team — led by Professor Karim Jerbi at the Université de Montréal, with contributions from Google DeepMind and co-authorship from deep learning pioneer Yoshua Bengio — set out to do something nobody had done before at this scale: rigorously test AI creativity against a massive sample of humans using a standardized psychological tool.

The primary instrument was something called the Divergent Association Task, or DAT. The test is simple: generate ten words that are as semantically unrelated to each other as possible. "Cloud, hammer, justice, pencil, ocean" — that kind of thing. The more conceptually distant your words, the higher your creativity score.
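
To make the scoring idea concrete, here is a minimal sketch of a DAT-style score: the average pairwise distance between word vectors, scaled by 100. It is an illustration under stated assumptions, not the study's actual pipeline. The three-dimensional vectors in TOY_VECTORS are invented for this example (a real scorer would use pretrained word embeddings), and the function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical 3-dimensional "embeddings", invented for illustration only.
# A real scorer would look words up in pretrained word vectors instead.
TOY_VECTORS = {
    "cat": (0.9, 0.1, 0.0),
    "dog": (0.8, 0.2, 0.1),
    "kitten": (0.95, 0.05, 0.0),
    "cloud": (0.0, 0.9, 0.2),
    "hammer": (0.1, 0.0, 0.95),
    "justice": (-0.7, 0.3, -0.4),
}


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dat_style_score(words, vectors=TOY_VECTORS):
    """Average cosine distance over every pair of words, times 100."""
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError("need at least two words to score")
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Closely related words should score low; unrelated words should score high.
related = dat_style_score(["cat", "dog", "kitten"])
unrelated = dat_style_score(["cat", "cloud", "hammer", "justice"])
print(f"related words:   {related:.1f}")
print(f"unrelated words: {unrelated:.1f}")
```

With these toy vectors, the list of near-synonyms lands close to zero while the list of unrelated words scores far higher, which is the behavior the test is designed to reward.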

Divergent thinking — the ability to range widely across unconnected concepts — is considered a core component of creative cognition. It's what underlies brainstorming, metaphor-making, and the kind of lateral thinking that produces genuinely novel ideas.

The team tested GPT-4, ChatGPT, Claude, Gemini, and five other models. They collected 500 responses from each. They ran the same test on 100,000 human participants from the US, UK, Canada, Australia, and New Zealand — balanced for age and gender. Then they compared the results.

What they found

AI won. Specifically, GPT-4 and several other models consistently scored above the average human on the DAT. The finding was robust across different conditions — not a statistical fluke, not an edge case.

The researchers also tested more complex creative tasks — writing haikus, movie synopses, and short fiction. The same basic pattern held for idea generation and brainstorming tasks. AI generated more semantically diverse responses than the typical human.

One detail worth noting because it cuts against a common assumption: bigger models weren't always more creative. Vicuna, a smaller open-source model, outscored several larger, more expensive commercial alternatives on certain creativity measures. Raw scale doesn't automatically produce creative range.

The catch — and it's a meaningful one

Here's what the headline misses: the top 10% of human creators still beat every AI model tested, and by a clear margin.

The gap is widest on poetry, plot summaries, and work requiring genuine originality: the kind of creativity that involves voice, lived experience, emotional specificity, and surprise.

The study's finding is more precisely stated as: AI now beats the average human, not the best humans. The distribution matters. Most people aren't highly creative on demand. When you're comparing against the full population, AI looks strong. When you're comparing against the people whose job is to be creative — writers, poets, artists, musicians — the picture changes.

Professor Jerbi framed it carefully: "Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks. This result may be surprising — even unsettling — but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans."

Why this result makes sense if you've been following this series

If you've read our deep dives on how ChatGPT works and what machine learning actually is, the study's findings make intuitive sense.

AI models are trained on a very large share of publicly available human writing: books, articles, poems, forum posts, and stories. When asked to generate ten unrelated words, the model draws on that vast exposure to human language and concept-space, finding statistically distant associations across billions of learned relationships.

That's a different process from human creativity, but it produces similar outputs on certain measures. The model isn't daydreaming or drawing on lived experience. It's sampling from a probability distribution over language that it learned from an enormous body of human writing.

Which is also why it hits a ceiling. The most creative humans aren't just drawing on existing ideas in new combinations — they're producing genuinely novel ones rooted in specific experiences, emotions, and observations that no training dataset fully captures. That gap remains real.

What this means practically

A few honest takeaways:

For people who use AI for creative work: this study validates what many already experience — AI is a genuinely useful creative collaborator for brainstorming, ideation, and generating options. Treating it as a creative assistant rather than a replacement is both accurate and practical.

For people worried about creative jobs: the study's nuance matters here. The roles most exposed are those requiring volume, variety, and speed in idea generation — certain types of copywriting, content generation, brainstorming facilitation. The roles least exposed are those where the human voice, perspective, and lived experience are the actual product. The difference between "generate ten tagline options" and "write a poem that actually says something true" remains significant.

For anyone who assumed creativity was safely human: the assumption needs updating. Not abandoned — updated. Average human creativity on certain well-defined tasks is no longer a safe ceiling. The interesting question now is what creativity actually means when AI can do the measurable parts of it.

The line worth sitting with

Yoshua Bengio, a co-author of the study, is one of the three researchers whose foundational work on deep learning makes all of this possible. He has now contributed to a paper showing that the systems he helped create score above the average human on a standardized measure of divergent thinking.

Whether that strikes you as triumphant, unsettling, or simply interesting probably tells you something about where you stand in this moment.

HumanReadable-AI covers AI news in plain English every weekday. Subscribe below — free, no jargon, always under five minutes.