ACM Transactions on Asian and Low-Resource Language Information Processing, volume 24, issue 4, pages 1-28

A Survey of Document Stemming Algorithms in Information Retrieval Systems

Publication type: Journal Article
Publication date: 2025-03-23
SCImago: Q2
SJR: 0.535
CiteScore: 3.6
Impact factor: 1.8
ISSN: 2375-4699, 2375-4702
Abstract

As databases grow in size and diversity, there is an urgent need for advanced techniques in Natural Language Processing (NLP) applications, especially in the field of Information Retrieval (IR). One of the most popular techniques for improving information retrieval is the stemming of text documents. Given the importance of stemming for information retrieval systems, this paper presents a detailed study of the established stemming approaches and the working mechanisms of the algorithms that follow each approach. We analyze and evaluate the most important algorithms by comparing them on specific criteria, including stemming strength, advantages, and disadvantages. This comparison identifies the weaknesses of each stemming algorithm. The main aim of the study is to overcome these weaknesses and build on the algorithms' main strengths in order to develop a new, more efficient stemming algorithm for the English language.
