A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs
Publication type: Journal Article
Publication date: 2025-01-01
Summary
Background: Biomedical data are inherently multimodal, comprising physical measurements and natural-language narratives. A generalist biomedical artificial intelligence (AI) model needs to simultaneously process different modalities of data, including text and images. Therefore, training an effective generalist biomedical model requires high-quality multimodal data, such as parallel image–text pairs.
Methods: Here, we present PMC-15M, a novel dataset that is two orders of magnitude larger than existing biomedical multimodal datasets, such as MIMIC-CXR, and spans a diverse range of biomedical image types. PMC-15M contains 15 million biomedical image–text pairs collected from 4.4 million scientific articles. Based on PMC-15M, we have pretrained BiomedCLIP, a multimodal foundation model, with domain-specific adaptations tailored to biomedical vision–language processing.
Results: We conducted extensive experiments and ablation studies on standard biomedical imaging tasks from retrieval to classification to visual question answering (VQA). BiomedCLIP achieved new state-of-the-art results in a wide range of standard datasets, substantially outperforming prior approaches. Intriguingly, by large-scale pretraining on diverse biomedical image types, BiomedCLIP even outperforms state-of-the-art radiology-specific models, such as BioViL, in radiology-specific tasks such as Radiological Society of North America (RSNA) pneumonia detection.
Conclusions: BiomedCLIP is a fully open-access foundation model that achieves state-of-the-art performance on various biomedical tasks, paving the way for transformative multimodal biomedical discovery and applications. We release our models at aka.ms/biomedclip to facilitate future research in multimodal biomedical AI.
Top 30
Journals (citing publications by journal)
- Lecture Notes in Computer Science: 19 publications, 12.58%
- medRxiv: 9 publications, 5.96%
- npj Digital Medicine: 6 publications, 3.97%
- IEEE Journal of Biomedical and Health Informatics: 5 publications, 3.31%
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 5 publications, 3.31%
- Nature Biomedical Engineering: 5 publications, 3.31%
- International Conference on Computer Vision Workshops (ICCV Workshops): 4 publications, 2.65%
- Medical Image Analysis: 3 publications, 1.99%
- Information Fusion: 3 publications, 1.99%
- Nature Communications: 2 publications, 1.32%
- Expert Systems with Applications: 2 publications, 1.32%
- Nature Medicine: 2 publications, 1.32%
- IEEE Transactions on Medical Imaging: 2 publications, 1.32%
- IEEE International Joint Conference on Neural Networks (IJCNN): 2 publications, 1.32%
- Journal of Imaging Informatics in Medicine: 2 publications, 1.32%
- Nature: 2 publications, 1.32%
- Computerized Medical Imaging and Graphics: 2 publications, 1.32%
- PLoS ONE: 2 publications, 1.32%
- Journal of Cardiology: 1 publication, 0.66%
- bioRxiv: 1 publication, 0.66%
- IEEE Workshop on Applications of Computer Vision (WACV): 1 publication, 0.66%
- Applied Sciences (Switzerland): 1 publication, 0.66%
- Physics of Fluids: 1 publication, 0.66%
- Journal of Clinical Medicine: 1 publication, 0.66%
- Mayo Clinic Proceedings Digital Health: 1 publication, 0.66%
- Multimedia Systems: 1 publication, 0.66%
- BMC Artificial Intelligence: 1 publication, 0.66%
- NEJM AI: 1 publication, 0.66%
- Information Processing and Management: 1 publication, 0.66%
Publishers (citing publications by publisher)
- Springer Nature: 48 publications, 31.79%
- Institute of Electrical and Electronics Engineers (IEEE): 47 publications, 31.13%
- Elsevier: 24 publications, 15.89%
- openRxiv: 10 publications, 6.62%
- MDPI: 6 publications, 3.97%
- Association for Computing Machinery (ACM): 5 publications, 3.31%
- Frontiers Media S.A.: 2 publications, 1.32%
- Public Library of Science (PLoS): 2 publications, 1.32%
- AIP Publishing: 1 publication, 0.66%
- Massachusetts Medical Society: 1 publication, 0.66%
- IOP Publishing: 1 publication, 0.66%
- Wiley: 1 publication, 0.66%
- Baishideng Publishing Group: 1 publication, 0.66%
- Oxford University Press: 1 publication, 0.66%
- American Association for the Advancement of Science (AAAS): 1 publication, 0.66%
- Publications without a DOI are not counted.
- Publication statistics are updated weekly.
Metrics
Total citations: 158
Citations since 2025: 146 (96.68%)
Cite
GOST
Zhang S. et al. A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs // NEJM AI. 2025. Vol. 2. No. 1.
GOST with all authors (up to 50)
Zhang S., Xu Y., Usuyama N., Xu H., Bagga J., Tinn R., Preston S., Rao R., Wei M., Valluri N., Wong C., Tupini A., Wang Yu., Mazzola M., Shukla S., Liden L., Gao J., Crabtree A., Piening B., Bifulco C. B., Lungren M. P., Naumann T., Wang S., Poon H. A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs // NEJM AI. 2025. Vol. 2. No. 1.
RIS
TY - JOUR
DO - 10.1056/aioa2400640
UR - https://ai.nejm.org/doi/10.1056/AIoa2400640
TI - A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs
T2 - NEJM AI
AU - Zhang, Sheng
AU - Xu, Yanbo
AU - Usuyama, Naoto
AU - Xu, Hanwen
AU - Bagga, Jaspreet
AU - Tinn, Robert
AU - Preston, Sam
AU - Rao, Rajesh
AU - Wei, Mu
AU - Valluri, Naveen
AU - Wong, C.
AU - Tupini, Andrea
AU - Wang, Yu
AU - Mazzola, Matt
AU - Shukla, Swadheen
AU - Liden, Lars
AU - Gao, Jianfeng
AU - Crabtree, Angela
AU - Piening, Brian
AU - Bifulco, Carlo B.
AU - Lungren, Matthew P.
AU - Naumann, Tristan
AU - Wang, Sheng
AU - Poon, Hoifung
PY - 2025
DA - 2025/01/01
PB - Massachusetts Medical Society
IS - 1
VL - 2
SN - 2836-9386
ER -
BibTeX (up to 50 authors)
@article{2025_Zhang,
author = {Sheng Zhang and Yanbo Xu and Naoto Usuyama and Hanwen Xu and Jaspreet Bagga and Robert Tinn and Sam Preston and Rajesh Rao and Mu Wei and Naveen Valluri and C. Wong and Andrea Tupini and Yu Wang and Matt Mazzola and Swadheen Shukla and Lars Liden and Jianfeng Gao and Angela Crabtree and Brian Piening and Carlo B. Bifulco and Matthew P. Lungren and Tristan Naumann and Sheng Wang and Hoifung Poon},
title = {A Multimodal Biomedical Foundation Model Trained from Fifteen Million Image–Text Pairs},
journal = {NEJM AI},
year = {2025},
volume = {2},
publisher = {Massachusetts Medical Society},
month = {jan},
url = {https://ai.nejm.org/doi/10.1056/AIoa2400640},
number = {1},
doi = {10.1056/aioa2400640}
}