Open Access
Volume 12, Issue 4

Generalization bias in large language model summarization of scientific research

Publication type: Journal Article
Publication date: 2025-04-01
SCImago: Q1
Web of Science: Q2
БС: 1
SJR: 0.795
CiteScore: 5.3
Impact factor: 2.9
ISSN: 2054-5703
Abstract

Artificial intelligence chatbots driven by large language models (LLMs) have the potential to increase public science literacy and support scientific research, as they can quickly summarize complex scientific information in accessible terms. However, when summarizing scientific texts, LLMs may omit details that limit the scope of research conclusions, leading to generalizations of results broader than warranted by the original study. We tested 10 prominent LLMs, including ChatGPT-4o, ChatGPT-4.5, DeepSeek, LLaMA 3.3 70B, and Claude 3.7 Sonnet, comparing 4900 LLM-generated summaries to their original scientific texts. Even when explicitly prompted for accuracy, most LLMs produced broader generalizations of scientific results than those in the original texts, with DeepSeek, ChatGPT-4o, and LLaMA 3.3 70B overgeneralizing in 26–73% of cases. In a direct comparison of LLM-generated and human-authored science summaries, LLM summaries were nearly five times more likely to contain broad generalizations (odds ratio = 4.85, 95% CI [3.06, 7.70], p < 0.001). Notably, newer models tended to perform worse in generalization accuracy than earlier ones. Our results indicate a strong bias in many widely used LLMs towards overgeneralizing scientific conclusions, posing a significant risk of large-scale misinterpretations of research findings. We highlight potential mitigation strategies, including lowering LLM temperature settings and benchmarking LLMs for generalization accuracy.
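One of the mitigation strategies mentioned in the abstract, lowering the LLM temperature setting, can be applied directly when requesting a summary through a chat-completion API. The sketch below is a minimal illustration only, assuming the OpenAI Python client (openai >= 1.0), a model name of "gpt-4o", and a hypothetical abstract_text variable; it is not the procedure used in the study.

from openai import OpenAI  # assumes the openai Python package (>= 1.0) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

abstract_text = "..."  # hypothetical placeholder: the scientific text to be summarized

# Temperature 0 makes the output more deterministic and less prone to loose paraphrase;
# the system prompt also asks the model to preserve the scope of the original claims.
response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[
        {
            "role": "system",
            "content": (
                "Summarize the following text faithfully. Do not generalize the "
                "findings beyond the population, sample, or conditions stated in it."
            ),
        },
        {"role": "user", "content": abstract_text},
    ],
)

print(response.choices[0].message.content)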


Metrics: 19
Cite
GOST
Peters U. H., Chin-Yee B. Generalization bias in large language model summarization of scientific research // Royal Society Open Science. 2025. Vol. 12. No. 4.
RIS
TY - JOUR
DO - 10.1098/rsos.241776
UR - https://royalsocietypublishing.org/doi/10.1098/rsos.241776
TI - Generalization bias in large language model summarization of scientific research
T2 - Royal Society Open Science
AU - Peters, Uwe H.
AU - Chin-Yee, Benjamin
PY - 2025
DA - 2025/04/01
PB - The Royal Society
IS - 4
VL - 12
SN - 2054-5703
ER -
BibTeX
@article{2025_Peters,
author = {Uwe H. Peters and Benjamin Chin-Yee},
title = {Generalization bias in large language model summarization of scientific research},
journal = {Royal Society Open Science},
year = {2025},
volume = {12},
publisher = {The Royal Society},
month = {apr},
url = {https://royalsocietypublishing.org/doi/10.1098/rsos.241776},
number = {4},
doi = {10.1098/rsos.241776}
}