IDLab Seminar: Why Humans Still Outperform AI in Strategic Games
On March 6, 2026, the International Laboratory of Intangible-driven Economy (IDLab) held its regular scientific seminar. Dmitry Dagaev, Head of the Laboratory of Sports Studies (LIS) and Senior Research Fellow at IDLab, presented a talk titled “Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament.” The study was co-authored by Egor Ivanov (Junior Research Fellow at IDLab), Petr Parshakov (Head of IDLab), Alexey Savvateev (mathematician and science communicator), and Gleb Vasiliev (Junior Research Fellow at LIS). The research examines a classical game-theoretic problem, the Colonel Blotto game, in the context of the modern competition between humans and Large Language Models (LLMs). The study is supported by the Russian Science Foundation (RSF) project 25-18-00539, “Comparative Analysis of AI-Based Agents and Real Individuals in Economic Decision-Making.”
The research is motivated by the fact that in many fields artificial intelligence already performs at a level comparable to, or even exceeding, that of humans. The researchers sought to determine whether this superiority persists in situations of complex strategic interaction where no single optimal winning algorithm exists. The focus was on AI's capacity for deep reasoning in environments where success depends on anticipating an opponent's logic and allocating limited resources effectively.
To collect data, the authors ran a series of round-robin tournaments in which the game was framed as a political election campaign. Each player had to distribute 100 resource units across nine states so as to beat the opponent in a majority of them (more details about the tournament can be found here: https://idlab.hse.ru/news/1129828404.html). The researchers compared the strategies of more than 200 people with those of several popular LLMs. The results showed that humans consistently outperformed the models. The researchers hypothesized that this might be due to so-called "k-level reasoning", the human ability to reason about an opponent's logic several steps ahead. Empirical testing supported this hypothesis: while the models often chose a simple uniform distribution or made calculation errors, humans employed flexible multi-level strategies, deliberately sacrificing some states to secure victory in others.
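The mechanics of such a tournament can be illustrated with a minimal Python sketch. It is not the authors' code. The budget of 100 units and the nine states come from the description above; everything else is an assumption made for illustration: integer allocations, tied states counting for neither side, a match going to whoever wins more states, the 1 / 0.5 / 0 scoring, and the three example allocations (`uniform`, `focused`, `counter`).

```python
from itertools import combinations

BUDGET = 100    # resource units per player (from the article)
N_STATES = 9    # number of states (from the article)


def validate(allocation):
    """Check that an allocation uses exactly BUDGET units over N_STATES states."""
    if len(allocation) != N_STATES:
        raise ValueError("allocation must cover every state")
    if any(x < 0 for x in allocation) or sum(allocation) != BUDGET:
        raise ValueError("allocation must be non-negative and sum to BUDGET")


def match_result(a, b):
    """Return 1 if a wins more states than b, -1 if fewer, 0 if equal.

    Assumption: a state with equal resources is won by neither player.
    """
    a_states = sum(x > y for x, y in zip(a, b))
    b_states = sum(y > x for x, y in zip(a, b))
    return (a_states > b_states) - (a_states < b_states)


def round_robin(strategies):
    """Play every pair once; assumed scoring: win 1, draw 0.5, loss 0."""
    for allocation in strategies.values():
        validate(allocation)
    points = {name: 0.0 for name in strategies}
    for p, q in combinations(strategies, 2):
        result = match_result(strategies[p], strategies[q])
        if result > 0:
            points[p] += 1.0
        elif result < 0:
            points[q] += 1.0
        else:
            points[p] += 0.5
            points[q] += 0.5
    return points


# Hypothetical allocations, chosen only to illustrate the reasoning levels.
uniform = [12, 11, 11, 11, 11, 11, 11, 11, 11]  # near-even split
focused = [20, 20, 20, 20, 20, 0, 0, 0, 0]      # gives up four states
counter = [21, 0, 0, 0, 0, 20, 20, 20, 19]      # built to beat `focused`

scores = round_robin({"uniform": uniform, "focused": focused, "counter": counter})
print(scores)
```

In this toy field the near-uniform split loses to the allocation that gives up four states, which in turn loses to an allocation designed one step further ahead. That is the kind of layered anticipation the k-level reasoning hypothesis refers to.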
The analysis also revealed that the most successful human participants were those with a background in STEM (Science, Technology, Engineering, and Mathematics). An intriguing fact also emerged: humans showed almost no adaptation of their strategies to a specific opponent. Although changing the pool of participants (from humans to neural networks) should in theory call for a revision of strategic choices, players did not change their resource allocation logic. This suggests that participants relied primarily on the rules of the game rather than on attempts to predict the actions of a specific opponent, effectively treating the AI the same way they would a human competitor.
The seminar concluded with a discussion in which participants explored ways to improve AI performance through prompt engineering or by assigning specific social roles to the models. These suggestions are seen as promising ways to test the robustness of the findings in future research.
