Open Access
Volume 20, issue 2, pages e0309364

Image captioning in Bengali language using visual attention

Publication type: Journal Article
Publication date: 2025-02-13
Scimago: Q1
WoS: Q2
SJR: 0.803
CiteScore: 5.4
Impact factor: 2.6
ISSN: 1932-6203
Abstract

Automatically generating image captions is one of the most challenging applications in artificial intelligence because it combines computer vision and natural language processing. The task is harder still for a language as intricate as Bengali, for which captioned image datasets are scarce. In this study, a human-annotated dataset of Bengali captions has been curated for a collection of images. An end-to-end architecture with an attention-driven decoder is also introduced to generate relevant image descriptions in Bengali. First, Gated Recurrent Units combine the spatial and temporal attributes of the images to form the input features. These features are then fed into the attention layer together with the embedded caption features. The attention mechanism models the relationship between the visual and linguistic representations. A two-layer recursive unit then uses the combined attention features to construct coherent sentences. Trained on the provided dataset, the model achieves a BLEU-4 score of 43%, a METEOR score of 39%, and a ROUGE score of 47%. Compared with all previous work on Bengali image captioning, these results represent the current state of the art.
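
The abstract describes an attention layer that relates image features to embedded caption features, but it does not say which scoring function is used. As a rough illustration only, the sketch below assumes additive (Bahdanau-style) attention, which is one common choice for captioning decoders. All names, dimensions, and the random weights are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(vec, mat):
    """Row vector times matrix: (n,) x (n, m) -> (m,)."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]


def additive_attention(image_feats, query, W_img, W_qry, v):
    """Assumed additive attention: score_i = v . tanh(f_i W_img + q W_qry).

    image_feats: list of region feature vectors (the visual side)
    query:       one vector standing in for the embedded caption state
    Returns the attention-weighted context vector and the weights.
    """
    q_proj = matvec(query, W_qry)
    scores = []
    for feat in image_feats:
        f_proj = matvec(feat, W_img)
        hidden = [math.tanh(f + q) for f, q in zip(f_proj, q_proj)]
        scores.append(sum(h * w for h, w in zip(hidden, v)))
    weights = softmax(scores)
    feat_dim = len(image_feats[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, image_feats))
               for d in range(feat_dim)]
    return context, weights


def random_matrix(rows, cols, rng, scale=0.1):
    """Random stand-in for trained parameters (illustration only)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 6 image regions, 8-dim features, 5-dim query, 4-dim attention space.
rng = random.Random(0)
num_regions, feat_dim, query_dim, attn_dim = 6, 8, 5, 4
image_feats = random_matrix(num_regions, feat_dim, rng, scale=1.0)
query = [rng.gauss(0.0, 1.0) for _ in range(query_dim)]
W_img = random_matrix(feat_dim, attn_dim, rng)
W_qry = random_matrix(query_dim, attn_dim, rng)
v = [rng.gauss(0.0, 1.0) for _ in range(attn_dim)]

context, weights = additive_attention(image_feats, query, W_img, W_qry, v)
print(len(context), len(weights))
```

In a decoder like the one the abstract outlines, a context vector of this kind would be recomputed at every generation step and passed to the recurrent layers along with the previous word embedding; that wiring is likewise an assumption here rather than a detail confirmed by the abstract.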


Top-30

Journals

  • Machine Learning with Applications: 1 publication, 33.33%
  • Results in Engineering: 1 publication, 33.33%
  • npj Heritage Science: 1 publication, 33.33%

Publishers

  • Elsevier: 2 publications, 66.67%
  • Springer Nature: 1 publication, 33.33%

Notes

  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Cite this
GOST
Masud A. et al. Image captioning in Bengali language using visual attention // PLoS ONE. 2025. Vol. 20. No. 2. p. e0309364.
GOST (all authors)
Masud A., Hosen M. B., Habibullah M., Anannya M., Kaiser M. S. Image captioning in Bengali language using visual attention // PLoS ONE. 2025. Vol. 20. No. 2. p. e0309364.
RIS
TY - JOUR
DO - 10.1371/journal.pone.0309364
UR - https://dx.plos.org/10.1371/journal.pone.0309364
TI - Image captioning in Bengali language using visual attention
T2 - PLoS ONE
AU - Masud, Adiba
AU - Hosen, Md Biplob
AU - Habibullah, Md.
AU - Anannya, Mehrin
AU - Kaiser, M. Shamim
PY - 2025
DA - 2025/02/13
PB - Public Library of Science (PLoS)
SP - e0309364
IS - 2
VL - 20
SN - 1932-6203
ER -
BibTeX
@article{2025_Masud,
author = {Adiba Masud and Md Biplob Hosen and Md. Habibullah and Mehrin Anannya and M. Shamim Kaiser},
title = {Image captioning in Bengali language using visual attention},
journal = {PLoS ONE},
year = {2025},
volume = {20},
publisher = {Public Library of Science (PLoS)},
month = {feb},
url = {https://dx.plos.org/10.1371/journal.pone.0309364},
number = {2},
pages = {e0309364},
doi = {10.1371/journal.pone.0309364}
}
MLA
Masud, Adiba, et al. “Image captioning in Bengali language using visual attention.” PLoS ONE, vol. 20, no. 2, Feb. 2025, p. e0309364. https://dx.plos.org/10.1371/journal.pone.0309364.