Evaluating ChatGPT-4 for the Interpretation of Images from Several Diagnostic Techniques in Gastroenterology
Background: Several artificial intelligence systems based on large language models (LLMs) have been developed commercially, and there is growing interest in applying them to clinical questions. Recent versions include image analysis capabilities, but their performance in gastroenterology remains untested. This study assesses ChatGPT-4’s performance in interpreting gastroenterology images.

Methods: A total of 740 images from five procedures were included: capsule endoscopy (CE), device-assisted enteroscopy (DAE), endoscopic ultrasound (EUS), digital single-operator cholangioscopy (DSOC), and high-resolution anoscopy (HRA). The images were analyzed by ChatGPT-4 using a predefined prompt for each. ChatGPT-4 predictions were compared with gold standard diagnoses. Performance measures included accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC).

Results: For CE, ChatGPT-4 demonstrated accuracies ranging from 50.0% to 90.0%, with AUCs of 0.50–0.90. For DAE, the model demonstrated an accuracy of 67.0% (AUC 0.670). For EUS, the system showed AUCs of 0.488 and 0.550 for the differentiation between pancreatic cystic and solid lesions, respectively. The LLM differentiated benign from malignant biliary strictures with an AUC of 0.550. For HRA, ChatGPT-4 showed an overall accuracy between 47.5% and 67.5%.

Conclusions: ChatGPT-4 demonstrated suboptimal diagnostic accuracy for image interpretation across several gastroenterology techniques, highlighting the need for further improvement before clinical adoption.
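The performance measures named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `binary_metrics`, the 0/1 label encoding, and the example labels are assumptions made purely for illustration. Because a model that returns a single hard label per image yields only one operating point, the sketch takes the AUC as the mean of sensitivity and specificity, which is what the ROC area reduces to in that case; whether the study computed AUC this way is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    auc: float


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Compute diagnostic metrics from gold-standard and predicted labels.

    Labels are assumed to be encoded as 1 (positive) and 0 (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        # With one hard label per image, the ROC curve has a single
        # operating point and its area is (sensitivity + specificity) / 2.
        auc=(sensitivity + specificity) / 2,
    )


# Hypothetical labels, not data from the study.
gold = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 1, 1]
example = binary_metrics(gold, pred)
print(example)
```

With equal numbers of positive and negative images, as in the hypothetical example, accuracy and this single-point AUC coincide.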
Top-30

Journals
- Journal of Medical Internet Research: 1 publication, 25%
- Artificial Intelligence Surgery: 1 publication, 25%
- Healthcare: 1 publication, 25%
- World Journal of Gastrointestinal Oncology: 1 publication, 25%

Publishers
- JMIR Publications: 1 publication, 25%
- OAE Publishing Inc.: 1 publication, 25%
- MDPI: 1 publication, 25%
- Baishideng Publishing Group: 1 publication, 25%

- We do not take into account publications without a DOI.
- Statistics recalculated weekly.