Event Knowledge in Large Language Models: The Gap Between the Impossible and the Unlikely

Published:2023-11 Issue:11 Volume:47
ISSN:0364-0213
Container-title:Cognitive Science
language:en
Short-container-title:Cognitive Science

Author:

Kauf Carina¹², Ivanova Anna A.¹²³, Rambelli Giulia⁴, Chersoni Emmanuele⁵, She Jingyuan Selena¹², Chowdhury Zawad⁶, Fedorenko Evelina¹², Lenci Alessandro⁷

Affiliation:

1. Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology

2. McGovern Institute for Brain Research, Massachusetts Institute of Technology

3. Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology

4. Department of Modern Languages, Literatures and Cultures, University of Bologna

5. Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University

6. Department of Mathematics, University of Washington

7. Department of Philology, Literature, and Linguistics, University of Pisa

Abstract

Word co‐occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs’ semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign a higher likelihood to possible versus impossible events ("The teacher bought the laptop" vs. "The laptop bought the teacher"). However, LLMs show less consistent preferences for likely versus unlikely events ("The nanny tutored the boy" vs. "The boy tutored the nanny"). In follow‐up analyses, we show that (i) LLM scores are driven by both plausibility and surface‐level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
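
The minimal-pair comparison described in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name `minimal_pair_accuracy`, the `toy_score` stand-in, and every number in `TOY_SCORES` are hypothetical placeholders chosen only so the example runs without a model. In practice, `score` would presumably be replaced by a sentence likelihood (for example, a summed token log-probability) from a pretrained LLM; the exact scoring procedure used in the paper is not specified on this page.

```python
from typing import Callable, Iterable, Tuple

SentencePair = Tuple[str, str]  # (plausible sentence, implausible sentence)


def minimal_pair_accuracy(
    pairs: Iterable[SentencePair],
    score: Callable[[str], float],
) -> float:
    """Fraction of pairs where the plausible sentence scores strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sentence pair")
    correct = sum(
        1 for plausible, implausible in pairs
        if score(plausible) > score(implausible)
    )
    return correct / len(pairs)


# ASSUMPTION: arbitrary placeholder scores, NOT results from the paper.
# A real evaluation would obtain these values from a language model.
TOY_SCORES = {
    "The teacher bought the laptop.": -20.0,
    "The laptop bought the teacher.": -35.0,
    "The nanny tutored the boy.": -24.0,
    "The boy tutored the nanny.": -23.5,
}


def toy_score(sentence: str) -> float:
    """Hypothetical stand-in for an LLM sentence score (higher = more likely)."""
    return TOY_SCORES[sentence]


# The two example pairs quoted in the abstract, ordered (plausible, implausible).
PAIRS = [
    ("The teacher bought the laptop.", "The laptop bought the teacher."),
    ("The nanny tutored the boy.", "The boy tutored the nanny."),
]

accuracy = minimal_pair_accuracy(PAIRS, toy_score)
print(f"minimal-pair accuracy: {accuracy:.2f}")
```

Passing the scorer in as a callable keeps the accuracy computation independent of any particular model, so the same function could be reused across the different models and sentence sets such a study compares.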

Funder

European Commission

Publisher

Wiley

Subject

Artificial Intelligence,Cognitive Neuroscience,Experimental and Cognitive Psychology

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/cogs.13386

References: 172 articles.

1. Abdou, M., Kulmizev, A., Hershcovich, D., Frank, S., Pavlick, E., & Søgaard, A. (2021). Can language models encode perceptual structure without grounding? A case study in color. In Proceedings of the 25th Conference on Computational Natural Language Learning (pp. 109–132).

2. Incremental interpretation at verbs: restricting the domain of subsequent reference

3. Atari, M., Xue, M. J., Park, P. S., Blasi, D., & Henrich, J. (2023). Which humans? https://doi.org/10.31234/osf.io/5b26t

4. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2014). Fitting linear mixed‐effects models using lme4. arXiv preprint arXiv:1406.5823.

Cited by 2 articles.

1. Driving and suppressing the human language network using large language models; Nature Human Behaviour; 2024-01-03

2. Meaning creation in novel noun-noun compounds: humans and language models; Language, Cognition and Neuroscience; 2023-09-11