Survey of Hallucination in Natural Language Generation
Natural Language Generation (NLG) has improved rapidly in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has produced more fluent and coherent NLG and has driven progress in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep-learning-based generation is prone to hallucinating unintended text, which degrades system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have proposed ways to measure and mitigate hallucinated text, but these have not previously been reviewed in a comprehensive manner.
In this survey, we thus provide a broad overview of the research progress and challenges concerning the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions, and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks: abstractive summarization, dialogue generation, generative question answering, data-to-text generation, and machine translation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
Top-30
Journals
- Lecture Notes in Computer Science: 124 publications, 6.54%
- Communications in Computer and Information Science: 33 publications, 1.74%
- IEEE Access: 26 publications, 1.37%
- Applied Sciences (Switzerland): 21 publications, 1.11%
- Lecture Notes in Networks and Systems: 17 publications, 0.9%
- Information Processing and Management: 15 publications, 0.79%
- SSRN Electronic Journal: 14 publications, 0.74%
- Electronics (Switzerland): 13 publications, 0.69%
- Neurocomputing: 13 publications, 0.69%
- Expert Systems with Applications: 13 publications, 0.69%
- Journal of Medical Internet Research: 12 publications, 0.63%
- Scientific Reports: 11 publications, 0.58%
- Frontiers in Artificial Intelligence: 11 publications, 0.58%
- Information (Switzerland): 10 publications, 0.53%
- Procedia Computer Science: 10 publications, 0.53%
- Mathematics: 9 publications, 0.47%
- npj Digital Medicine: 9 publications, 0.47%
- ACM Computing Surveys: 9 publications, 0.47%
- AI and Society: 8 publications, 0.42%
- Knowledge-Based Systems: 8 publications, 0.42%
- JMIR Medical Education: 7 publications, 0.37%
- Bioinformatics: 7 publications, 0.37%
- Computers and Education Artificial Intelligence: 7 publications, 0.37%
- IEEE Transactions on Visualization and Computer Graphics: 7 publications, 0.37%
- Cureus: 6 publications, 0.32%
- Nature: 6 publications, 0.32%
- Journal of Biomedical Informatics: 6 publications, 0.32%
- International Journal of Human-Computer Interaction: 6 publications, 0.32%
- ACM Transactions on Software Engineering and Methodology: 6 publications, 0.32%
Publishers
- Springer Nature: 429 publications, 22.61%
- Institute of Electrical and Electronics Engineers (IEEE): 339 publications, 17.87%
- Elsevier: 298 publications, 15.71%
- Association for Computing Machinery (ACM): 278 publications, 14.65%
- MDPI: 126 publications, 6.64%
- Wiley: 56 publications, 2.95%
- Taylor & Francis: 39 publications, 2.06%
- JMIR Publications: 38 publications, 2%
- Frontiers Media S.A.: 37 publications, 1.95%
- SAGE: 27 publications, 1.42%
- Oxford University Press: 27 publications, 1.42%
- Ovid Technologies (Wolters Kluwer Health): 21 publications, 1.11%
- Cold Spring Harbor Laboratory: 19 publications, 1%
- Emerald: 10 publications, 0.53%
- American Chemical Society (ACS): 8 publications, 0.42%
- Social Science Electronic Publishing: 7 publications, 0.37%
- IOP Publishing: 7 publications, 0.37%
- IGI Global: 6 publications, 0.32%
- Cambridge University Press: 6 publications, 0.32%
- American Medical Association (AMA): 5 publications, 0.26%
- MIT Press: 5 publications, 0.26%
- Public Library of Science (PLoS): 5 publications, 0.26%
- Institute for Operations Research and the Management Sciences (INFORMS): 4 publications, 0.21%
- BMJ: 4 publications, 0.21%
- Walter de Gruyter: 4 publications, 0.21%
- Royal Society of Chemistry (RSC): 4 publications, 0.21%
- American Association for the Advancement of Science (AAAS): 3 publications, 0.16%
- The Royal Society: 3 publications, 0.16%
- ASME International: 3 publications, 0.16%
- We do not take into account publications without a DOI.
- Statistics recalculated weekly.