Chances Are
Gray swan neglect: Do forecasters account for low(ish) probability events?
Dogukan Demircioglu, Faith Hill & Samuel Johnson
Journal of Experimental Psychology: Applied, forthcoming
Abstract:
Economic choices depend on our predictions of the future. Yet, at times, predictions are not based on all possible outcomes, but instead on the single most likely one, which is treated as though certainly the case -- that is, digitally. Four sets of studies test whether this digitization bias occurs in higher-stakes economic contexts. When making predictions about future asset prices, participants ignored conditional probability information given relatively unlikely events and relied entirely on conditional probabilities given the more-likely events. This effect was found for both financial aggregates and individual stocks, for binary predictions about the direction and continuous predictions about expected values, and even when the "unlikely" event explicitly had a probability as high as 30%; further, it occurred in incentive-compatible conditions and among financial professionals. Implications for probabilistic cognition and behavioral economics are discussed.
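To make the contrast concrete, here is a minimal sketch of a full probabilistic forecast versus a "digitized" one. It is an illustration only, not the authors' procedure: the function names and the example probabilities (a 70% likely event, with an 80% chance of a price rise if it happens and 20% if it does not) are hypothetical.

```python
def full_forecast(p_event, p_up_given_event, p_up_given_no_event):
    """Law of total probability: weight both branches by how likely they are."""
    return p_event * p_up_given_event + (1 - p_event) * p_up_given_no_event


def digitized_forecast(p_event, p_up_given_event, p_up_given_no_event):
    """Treat the more likely branch as certain and ignore the other one."""
    if p_event >= 0.5:
        return p_up_given_event
    return p_up_given_no_event


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    p_event, p_up_if_event, p_up_if_not = 0.7, 0.8, 0.2
    print("full:     ", round(full_forecast(p_event, p_up_if_event, p_up_if_not), 2))
    print("digitized:", digitized_forecast(p_event, p_up_if_event, p_up_if_not))
```

With these numbers the full forecast is about 0.62, while the digitized forecast stays at 0.8 because the 30% branch is dropped entirely.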
Speed and quality of complex strategic decisions
Uwe Sunde, Dainis Zegners & Anthony Strittmatter
Proceedings of the National Academy of Sciences, 19 May 2026
Abstract:
Many strategic decisions involve both substantial complexity and time pressure, but the association between decision speed and decision quality of cognitively demanding strategic decisions is not well understood. This paper presents evidence on this question using a setting with exceptionally detailed and precise information about decision times and decision quality -- it analyses move-by-move data from in-person professional chess tournaments. Decision quality is measured by comparing actual moves to a computational benchmark of best moves constructed using the artificial intelligence of a chess engine. The results show that faster decisions are associated with higher decision quality, even after accounting for computational complexity, distinctiveness between alternatives, and time pressure. Greater computational complexity and lower distinctiveness between move alternatives are associated with longer decision times, whereas greater time pressure is associated with shorter decision times. All three factors are associated with lower decision quality. We discuss the findings against the predictions of different decision models in which individuals sequentially acquire information about alternatives with uncertain valuations, extending theories originally developed in the context of nonstrategic decisions to a strategic environment.
Evaluating large language models for accuracy incentivizes hallucinations
Adam Tauman Kalai et al.
Nature, 28 May 2026, Pages 1047-1051
Abstract:
Large language models sometimes produce confident, plausible falsehoods ("hallucinations"), limiting their reliability. Prior work has offered numerous explanations and effective mitigations such as retrieval and tool use, consistency-based self-verification, and reinforcement learning from human feedback. Nonetheless, the problem persists even in state-of-the-art language models. Here we show how next-word prediction and accuracy-based evaluations inadvertently reward unwarranted guessing. Initially, next-word pretraining creates statistical pressure toward hallucination even with idealized error-free data: using learning theory, we show that facts lacking repeated support in training data (such as one-off details) yield unavoidable errors, while recurring regularities (such as grammar) do not. Subsequent training stages aim to correct such errors. However, dominant headline metrics like accuracy systematically reward guessing over admitting uncertainty. To align incentives, we suggest two additions to the classic approach of adding error penalties to evaluations to control abstention. First, we propose "open-rubric" evaluations that explicitly state how errors are penalized (if at all), which test whether a model modulates its abstentions to stated stakes while optimizing accuracy. Second, since hallucination-specific benchmarks rarely make leaderboards, we suggest using open-rubric variants of existing evaluations, to reverse their guessing incentives. Reframing hallucination as an incentive problem opens a practical path toward more reliable language models.
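The incentive argument can be sketched with a little expected-value arithmetic. This is an illustration under stated assumptions, not the paper's evaluation code: it assumes a correct answer scores 1, an abstention scores 0, and a wrong answer costs a hypothetical `penalty`; the function names are invented for the example.

```python
def expected_score_if_answering(p_correct, penalty):
    """Expected score of answering: +1 if right, -penalty if wrong."""
    return p_correct - penalty * (1 - p_correct)


def should_answer(p_correct, penalty):
    """Answer only if doing so beats abstaining, which scores 0."""
    return expected_score_if_answering(p_correct, penalty) > 0


def break_even_confidence(penalty):
    """Confidence at which answering and abstaining score the same."""
    return penalty / (1 + penalty)


if __name__ == "__main__":
    for penalty in (0, 1, 3):
        print(
            f"penalty={penalty}: answer at 30% confidence? "
            f"{should_answer(0.3, penalty)}; "
            f"break-even confidence={break_even_confidence(penalty):.2f}"
        )
```

Under plain accuracy (penalty 0) any nonzero chance of being right makes guessing worthwhile. Once errors carry a stated penalty, answering pays only above the break-even confidence, so a model that responds to the rubric should abstain more as the penalty grows.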
Scientific authority cues increase the spread of misinformation
Ismail Harrando, Rodrigo Reyes Cordova & Achim Edelmann
Proceedings of the National Academy of Sciences, 19 May 2026
Abstract:
Despite extensive efforts to curb misinformation, it continues to spread on social media platforms. Research indicates that people rarely share content they know is false; rather, they do so unintentionally due to cognitive biases triggered by contextual cues. Combining a large-scale behavioral data analysis with an experimental study, we investigate how Scientific Authority Cues, which we define as mentions of entities that signal scientific expertise or findings, affect the sharing of fact-checked content on social media by granting it credibility. Analyzing 8.7 million tweets linking to fact-checked content shared on Twitter (now X) between 2010 and 2023, we find that tweets with such cues are shared more, especially when linking to low veracity content or posted by right-leaning users. In a preregistered experiment deployed to U.S. participants (N = 1,187), headlines that attributed claims to scientific entities elicited a greater willingness to share, an effect statistically mediated by perceived accuracy and influenced by trust and familiarity. Together, these findings reveal the double edge of invoking scientific authority: In polarized online spaces, cues that signal scientific credibility can be co-opted to amplify misinformation.
Norms in Conflict: Why AI Advisors Fail to Improve Human Coordination
Zhongheng Qiao et al.
Purdue University Working Paper, March 2026
Abstract:
Cooperation failures in social dilemmas persist because individually rational behavior yields inefficient collective outcomes. Advances in AI raise two possibilities: AI may improve outcomes by advising humans or by acting autonomously. We test both in a repeated threshold public-goods experiment with heterogeneous valuations of the public good. Such threshold games model a broad class of burden-sharing problems (crowdfunding, shared infrastructure, multilateral agreements) in which efficiency requires not only coordination but agreement on a cost-sharing norm. We compare a human-only benchmark to treatments with an AI advisor (OpenAI's GPT-5) and to treatments in which AI agents make allocations directly. AI Only groups outperform human groups, reaching the threshold more often. The mechanism is illuminating: AI agents contribute near-equal amounts largely independent of valuations, whereas humans scale contributions with valuations. In contrast, AI advisors do not improve on human-only outcomes. We distinguish two bottlenecks to coordination -- an information bottleneck, in which parties lack the calculations needed to condition on others' behavior, and a legitimacy bottleneck, in which parties reject cost-sharing rules that conflict with their fairness norms. AI overcomes the first but not the second.
Large language models pass a standard three-party Turing test
Cameron Jones & Benjamin Bergen
Proceedings of the National Academy of Sciences, 26 May 2026
Abstract:
The Turing test has been widely discussed as a test of machine intelligence, but it also provides a measure of how humans distinguish other humans from machines. We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomized, controlled, and preregistered Turing tests on independent populations. Participants had 5-min conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant. LLaMa-3.1, with the same prompt, was judged to be the human 56% of the time -- not significantly more or less often than the humans it was being compared to. Without these prompts, however, the same models performed significantly worse (38% and 36%), and did not consistently outperform baseline models, ELIZA and GPT-4o (23% and 21%, respectively). A third study replicated these results in 15-min games: two PERSONA-prompted models achieved pass rates of 56% and 59%. The results constitute empirical evidence that artificial systems can pass a standard three-party Turing test. Interrogators' reasoning focused more on stylistic and socio-emotional aspects of human behavior than on more traditional notions of intelligence. The results have implications for debates about what kind of intelligence is exhibited by large language models, the social impacts these systems are likely to have, and the aspects of human behavior that people continue to see as unique.
Language models transmit behavioural traits through hidden signals in data
Alex Cloud et al.
Nature, 16 April 2026, Pages 615-621
Abstract:
Large language models (LLMs) are increasingly used to generate data to train improved models, but it remains unclear what properties are transmitted in this model distillation. Here we show that distillation can lead to subliminal learning -- the transmission of behavioural traits through semantically unrelated data. In our main experiments, a 'teacher' model with some trait T (such as disproportionately generating responses favouring owls or showing broad misaligned behaviour) generates datasets consisting solely of number sequences. Remarkably, a 'student' model trained on these data learns T, even when references to T are rigorously removed. More realistically, we observe the same effect when the teacher generates math reasoning traces or code. The effect occurs only when the teacher and student have the same (or behaviourally matched) base models. To help explain this, we prove a theoretical result showing that subliminal learning arises in neural networks under broad conditions and demonstrate it in a simple multilayer perceptron (MLP) classifier. As artificial intelligence systems are increasingly trained on the outputs of one another, they may inherit properties not visible in the data. Safety evaluations may therefore need to examine not just behaviour, but the origins of models and training data and the processes used to create them.
Learning reinforces curiosity for related information
Yaniv Abir et al.
Proceedings of the National Academy of Sciences, 28 April 2026
Abstract:
Human curiosity is dynamic, but the principles governing its fluctuations remain debated. Here, we test two competing hypotheses about how past learning shapes subsequent curiosity and memory. The first, based on the "optimal arousal" theory, proposes that satisfying curiosity reduces subsequent curiosity. The second, grounded in reinforcement learning, suggests that satisfying curiosity strengthens it. To distinguish between these accounts, we analyzed information-seeking decisions from 5,831 participants, who chose whether to wait for answers to a range of questions. We examined how engagement with questions and answers, as well as information prediction errors, influenced subsequent curiosity. Reading satisfying answers increased curiosity compared to reading dissatisfying ones. Critically, this depended on semantic similarity: prior learning enhances subsequent curiosity only when new information is related to previously learned content. These results suggest that curiosity operates as an information-seeking policy learned through reinforcement. Humans may therefore seek information not only to improve future instrumental decisions, but also to learn what to be curious about.
A simple threshold captures the social learning of conventions
Douglas Guilbeault, Spencer Caplan & Charles Yang
Proceedings of the National Academy of Sciences, 28 April 2026
Abstract:
A persistent puzzle throughout the cognitive and social sciences is how people manage to learn social conventions from the sparse and noisy behavioral data of diverse actors, without explicit instruction. Here, we show that the dominant theories of social learning perform poorly at capturing how individuals learn conventions in coordination experiments that task them with matching their behaviors while interacting in social networks. Across experiments, participants' choices systematically deviate from both imitation and optimization. Instead, they follow a categorical, two-stage learning process: they behave probabilistically until they acquire enough information about each other to trigger a mental threshold and then their choices stabilize. We effectively estimate this threshold using the tolerance principle (TP), a parameter-free equation developed to model how children learn rules in language. We show that threshold-based agents produce social learning that is more accurate than imitating and optimizing agents, while also providing a better model of how a critical mass of dissenters can overturn conventions. The superior performance of our model holds when comparing against a variety of optimization approaches, including Bayesian inference. Furthermore, in a preregistered dyadic experiment requiring people to infer nonlinguistic behavioral patterns amid controlled levels of noise in observed signals, TP outperforms all other models at reproducing learning rates among human participants. These findings offer compelling evidence that a simple, mathematical threshold underlies individual and social learning, from grammatical rules to behavioral conventions.
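For readers unfamiliar with the tolerance principle, here is a minimal sketch of its usual statement from the language-acquisition literature: a rule covering N items is adopted as long as the number of exceptions does not exceed N / ln N. How the authors map observed behavior in their experiments onto N and exceptions is not given in the abstract, so the function names and the example counts below are assumptions for illustration only.

```python
import math


def tolerance_threshold(n_items):
    """Maximum number of exceptions a rule over n_items can tolerate: N / ln N."""
    if n_items < 2:
        raise ValueError("n_items must be at least 2 (ln N must be positive)")
    return n_items / math.log(n_items)


def adopts_rule(n_items, n_exceptions):
    """True if the exceptions are few enough for the rule to be adopted."""
    return n_exceptions <= tolerance_threshold(n_items)


if __name__ == "__main__":
    # Hypothetical counts: 10 observed cases, with 4 or 5 that break the pattern.
    print(round(tolerance_threshold(10), 2))
    print(adopts_rule(10, 4), adopts_rule(10, 5))
```

The threshold has no free parameter to fit: it depends only on how many cases have been observed, and the switch from "not yet" to "adopt" is categorical rather than gradual, which fits the two-stage process the abstract describes.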