Open Access
Applied Sciences (Switzerland), volume 15, issue 6, article 2983

Generative Models in Medical Visual Question Answering: A Survey

Publication type: Journal Article
Publication date: 2025-03-10
Scimago quartile: Q2
SJR: 0.508
CiteScore: 5.3
Impact factor: 2.5
ISSN: 2076-3417
Abstract

Medical Visual Question Answering (MedVQA) sits at a crucial intersection of artificial intelligence and healthcare. It enables systems to interpret medical images—such as X-rays, MRIs, and pathology slides—and respond to clinical queries. Early approaches relied primarily on discriminative models, which select answers from predefined candidates. However, these methods struggle to address open-ended, domain-specific, or complex queries effectively. Recent advances have shifted the focus toward generative models, which leverage autoregressive decoders, large language models (LLMs), and multimodal large language models (MLLMs) to produce more nuanced, free-form answers. This review comprehensively examines the paradigm shift from discriminative to generative systems. It analyzes generative MedVQA works in terms of their model architectures and training processes, summarizes evaluation benchmarks and metrics, and highlights key advances and techniques that propel the development of generative MedVQA, such as concept alignment, instruction tuning, and parameter-efficient fine-tuning (PEFT), alongside strategies for data augmentation and automated dataset creation. Finally, we propose future directions to enhance clinical reasoning and interpretability, build robust evaluation benchmarks and metrics, and employ scalable training strategies and deployment solutions. By analyzing the strengths and limitations of existing generative MedVQA approaches, we aim to provide valuable insights for researchers and practitioners working in this domain.
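
Of the techniques named in the abstract, parameter-efficient fine-tuning is the most directly transferable to practice. The snippet below is a minimal, illustrative sketch (not taken from the survey) of attaching LoRA adapters to a generative vision-language backbone with the Hugging Face transformers and peft libraries; the checkpoint name, target modules, and hyperparameters are assumptions chosen for illustration.

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Hypothetical backbone: any generative vision-language model that answers
# questions about an image would do. This checkpoint is an illustrative
# assumption, not one prescribed by the survey.
model_name = "llava-hf/llava-1.5-7b-hf"

processor = AutoProcessor.from_pretrained(model_name)
model = LlavaForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.float16)

# PEFT via LoRA: freeze the backbone and train only small low-rank adapter
# matrices injected into the attention projections. Rank, alpha, and target
# module names are typical defaults, not values reported in the survey.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # usually well under 1% of all weights

# From here, the adapted model can be instruction-tuned on MedVQA
# (image, question, answer) triples with a standard causal language-modeling
# loss over the generated answer tokens.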
