Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions

Shiyao Xie 1, 2
Wenjing Zhao 1, 2
Guanghui Deng 3
Guohua He 4
Na He 5
Zhenhua Lu 6
Weihua Hu 7
Mingming Zhao 8
Jian Du 1, 2
Publication type: Journal Article
Publication date: 2024-05-17
Scimago quartile: Q1
Web of Science quartile: Q1
SJR: 2.039
CiteScore: 11.1
Impact factor: 4.6
ISSN: 1067-5027, 1527-974X
Abstract
Objective

Synthesizing and evaluating inconsistent medical evidence is essential in evidence-based medicine. This study aimed to employ ChatGPT as a sophisticated scientific reasoning engine to identify conflicting clinical evidence and summarize unresolved questions to inform further research.

Materials and Methods

We evaluated ChatGPT’s effectiveness in identifying conflicting evidence and investigated its principles of logical reasoning. An automated framework was developed to generate a PubMed dataset focused on controversial clinical topics. ChatGPT analyzed this dataset to identify consensus and controversy, and to formulate unresolved research questions. Expert evaluations were conducted (1) on the consensus and controversy for factual consistency, comprehensiveness, and potential harm, and (2) on the research questions for relevance, innovation, clarity, and specificity.
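The page does not reproduce the authors’ code; as a rough illustration only, the sketch below shows what such an automated pipeline could look like in Python: it retrieves PubMed IDs for a controversial topic through the NCBI E-utilities API and asks the gpt-4-1106-preview model to label the relation between a pair of claims. The search term, prompt wording, and the support/contradict/neutral label set are assumptions for illustration, not the authors’ actual framework.

```python
# Hypothetical sketch (not the authors' code): fetch PubMed records on a
# controversial clinical topic and ask gpt-4-1106-preview whether two
# extracted claims support each other, contradict each other, or neither.
import requests
from openai import OpenAI

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def search_pubmed(term: str, retmax: int = 20) -> list[str]:
    """Return PubMed IDs for a search term (example query, not the paper's)."""
    r = requests.get(
        f"{EUTILS}/esearch.fcgi",
        params={"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"},
    )
    r.raise_for_status()
    return r.json()["esearchresult"]["idlist"]

def classify_claim_pair(client: OpenAI, claim_a: str, claim_b: str) -> str:
    """Ternary label for a claim pair; SUPPORT/CONTRADICT/NEUTRAL is an assumed label set."""
    prompt = (
        "Do the following two clinical claims support each other, contradict "
        "each other, or neither? Answer with one word: SUPPORT, CONTRADICT, or NEUTRAL.\n"
        f"Claim A: {claim_a}\nClaim B: {claim_b}"
    )
    resp = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().upper()

if __name__ == "__main__":
    pmids = search_pubmed("aspirin primary prevention cardiovascular")  # example topic
    print(f"Retrieved {len(pmids)} PMIDs, e.g. {pmids[:3]}")
    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    label = classify_claim_pair(
        client,
        "Aspirin reduces cardiovascular events in primary prevention.",
        "Aspirin provides no net benefit for primary prevention of cardiovascular disease.",
    )
    print("Relation:", label)
```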

Results

The gpt-4-1106-preview model achieved a 90% recall rate in detecting inconsistent claim pairs within a ternary assertion setup. Notably, without explicit reasoning prompts, ChatGPT provided sound reasoning for the assertions between claims and hypotheses, based on an analysis grounded in relevance, specificity, and certainty. ChatGPT’s conclusions on consensus and controversies in the clinical literature were comprehensive and factually consistent. The research questions proposed by ChatGPT received high expert ratings.
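For context, recall here is the fraction of truly inconsistent claim pairs that the model flags as inconsistent. A toy computation on invented labels (not the study’s data) is shown below:

```python
# Toy illustration with invented labels: recall for the CONTRADICT class in a
# ternary claim-pair classification, i.e. the share of truly inconsistent
# pairs the model actually detects.
gold = ["CONTRADICT", "SUPPORT", "CONTRADICT", "NEUTRAL", "CONTRADICT"]
pred = ["CONTRADICT", "SUPPORT", "CONTRADICT", "CONTRADICT", "NEUTRAL"]

true_pos = sum(g == p == "CONTRADICT" for g, p in zip(gold, pred))
actual_pos = sum(g == "CONTRADICT" for g in gold)
print(f"Recall on inconsistent pairs: {true_pos / actual_pos:.2f}")  # 2/3 -> 0.67
```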

Discussion

Our experiment implies that, in evaluating the relationship between evidence and claims, ChatGPT considered more detailed information beyond a straightforward assessment of sentiment orientation. This ability to process intricate information and conduct scientific reasoning about sentiment is noteworthy, particularly as this pattern emerged without explicit guidance or directives in prompts, highlighting ChatGPT’s inherent logical reasoning capabilities.

Conclusion

This study demonstrated ChatGPT’s capacity to evaluate and interpret scientific claims. Such proficiency can be generalized to broader clinical research literature. ChatGPT effectively facilitates clinical studies by proposing unresolved challenges based on an analysis of existing studies. However, caution is advised, as ChatGPT’s outputs are inferences drawn from the input literature and could be harmful to clinical practice.


Top-30

Journals
  • Rheumatology International: 1 publication, 9.09%
  • Journal of NeuroInterventional Surgery: 1 publication, 9.09%
  • Journal of Innovation and Knowledge: 1 publication, 9.09%
  • Bioengineering: 1 publication, 9.09%
  • Scientific Reports: 1 publication, 9.09%
  • Exploratory Research in Clinical and Social Pharmacy: 1 publication, 9.09%
  • AI: 1 publication, 9.09%
  • BMC Medicine: 1 publication, 9.09%
  • npj Health Systems: 1 publication, 9.09%
  • Indian Journal of Orthopaedics: 1 publication, 9.09%

Publishers
  • Springer Nature: 4 publications, 36.36%
  • Elsevier: 2 publications, 18.18%
  • MDPI: 2 publications, 18.18%
  • BMJ: 1 publication, 9.09%
  • Institute of Electrical and Electronics Engineers (IEEE): 1 publication, 9.09%
  • Ovid Technologies (Wolters Kluwer Health): 1 publication, 9.09%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Cite this

GOST
Xie S., Zhao W., Deng G., He G., He N., Lu Z., Hu W., Zhao M., Du J. Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions // Journal of the American Medical Informatics Association : JAMIA. 2024. Vol. 31. No. 7.

RIS
TY - JOUR
DO - 10.1093/jamia/ocae100
UR - https://doi.org/10.1093/jamia/ocae100
TI - Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions
T2 - Journal of the American Medical Informatics Association : JAMIA
AU - Xie, Shiyao
AU - Zhao, Wenjing
AU - Deng, Guanghui
AU - He, Guohua
AU - He, Na
AU - Lu, Zhenhua
AU - Hu, Weihua
AU - Zhao, Mingming
AU - Du, Jian
PY - 2024
DA - 2024/05/17
PB - Oxford University Press
IS - 7
VL - 31
PMID - 38758667
SN - 1067-5027
SN - 1527-974X
ER -

BibTeX
@article{2024_Xie,
author = {Shiyao Xie and Wenjing Zhao and Guanghui Deng and Guohua He and Na He and Zhenhua Lu and Weihua Hu and Mingming Zhao and Jian Du},
title = {Utilizing ChatGPT as a scientific reasoning engine to differentiate conflicting evidence and summarize challenges in controversial clinical questions},
journal = {Journal of the American Medical Informatics Association : JAMIA},
year = {2024},
volume = {31},
publisher = {Oxford University Press},
month = {may},
url = {https://doi.org/10.1093/jamia/ocae100},
number = {7},
doi = {10.1093/jamia/ocae100}
}