Open Access
Lecture Notes in Computer Science, pages 71-85
Enhanced Rules Application Order to Stem Affixation, Reduplication and Compounding Words in Malay Texts
Strategic Research, CyberSecurity Malaysia, Seri Kembangan, Malaysia
Publication type: Book Chapter
Publication date: 2016-08-06
Journal: Lecture Notes in Computer Science
Q2
SJR: 0.606
CiteScore: 2.6
Impact factor: —
ISSN: 03029743, 16113349, 18612075, 18612083
Abstract
A word stemmer is an automated program that removes affixes, clitics and particles from derived words based on the morphological structure of a specific natural language. Word stemmers have been widely used for text preprocessing in many artificial intelligence applications. Furthermore, how accurately a word stemmer stems derived words influences the performance of information retrieval, text mining and text categorization applications. Although various stemming approaches have been proposed in past research, existing word stemmers for the Malay language still suffer from stemming errors. Moreover, existing word stemmers consider the morphological structure of Malay only partially: they focus on affixation words instead of handling affixation, reduplication and compounding words together. Therefore, this paper proposes an enhanced word stemmer, called enhanced rule application order, that combines rule-based affix removal with dictionary lookup. It is able to stem affixation, reduplication and compounding words and, at the same time, to address possible stemming errors. This paper also examines possible root causes of the affixation, reduplication and compounding stemming errors that can occur during the word stemming process. The experimental results indicate that, by using the enhanced rule application order, the proposed word stemmer stems affixation, reduplication and compounding words with better stemming accuracy.
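To make the combination of rule-based affix removal and dictionary lookup concrete, here is a minimal Python sketch. It is not the paper's enhanced rule application order, which the abstract does not specify. The root dictionary `ROOTS`, the affix lists, the step order and the function names are all illustrative assumptions.

```python
# Toy root-word dictionary: an assumption for illustration, not the paper's lexicon.
ROOTS = {"main", "makan", "ajar", "buku", "baca", "tanggungjawab"}

# Small, incomplete affix lists (assumed for this sketch only).
PREFIXES = ("mem", "men", "me", "ber", "bel", "pel", "di", "ter", "pe")
SUFFIXES = ("kan", "an", "i")


def _candidates(word):
    """Yield candidate stems: prefix removed, suffix removed, then both removed."""
    for p in PREFIXES:
        if word.startswith(p):
            yield word[len(p):]
    for s in SUFFIXES:
        if word.endswith(s):
            yield word[:-len(s)]
    for p in PREFIXES:
        for s in SUFFIXES:
            if (word.startswith(p) and word.endswith(s)
                    and len(word) > len(p) + len(s)):
                yield word[len(p):-len(s)]


def stem(word):
    """Return the dictionary root of `word`, or `word` unchanged if none is found."""
    word = word.lower()
    # Step 1: dictionary lookup first, so roots such as "makan" are never
    # overstemmed by a suffix rule ("makan" ends in "an").
    if word in ROOTS:
        return word
    # Step 2: reduplication. Stem each hyphen-separated part and accept
    # the result only when all parts reduce to the same stem.
    if "-" in word:
        parts = [stem(part) for part in word.split("-")]
        return parts[0] if len(set(parts)) == 1 else word
    # Step 3: affixation. Accept a candidate only if the dictionary confirms it.
    # Closed compounds such as "tanggungjawab" are handled as dictionary entries.
    for candidate in _candidates(word):
        if candidate in ROOTS:
            return candidate
    return word
```

In this sketch, `stem("dimainkan")` and `stem("bermain-main")` both reduce to `main`, and a word with no matching dictionary entry is returned unchanged instead of being cut by a rule alone.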
Cite this
GOST
Kassim M. N. et al. Enhanced Rules Application Order to Stem Affixation, Reduplication and Compounding Words in Malay Texts // Lecture Notes in Computer Science. 2016. pp. 71-85.
GOST all authors (up to 50)
Kassim M. N., Maarof M. A., Zainal A., Abdul Wahab A. Enhanced Rules Application Order to Stem Affixation, Reduplication and Compounding Words in Malay Texts // Lecture Notes in Computer Science. 2016. pp. 71-85.
RIS
TY - CHAP
DO - 10.1007/978-3-319-42706-5_6
UR - https://doi.org/10.1007/978-3-319-42706-5_6
TI - Enhanced Rules Application Order to Stem Affixation, Reduplication and Compounding Words in Malay Texts
T2 - Lecture Notes in Computer Science
AU - Kassim, Mohamad Nizam
AU - Maarof, Mohd Aizaini
AU - Zainal, Anazida
AU - Abdul Wahab, Amirudin
PY - 2016
DA - 2016/08/06
PB - Springer Nature
SP - 71
EP - 85
SN - 0302-9743
SN - 1611-3349
SN - 1861-2075
SN - 1861-2083
ER -
BibTex (up to 50 authors)
@incollection{2016_Kassim,
author = {Mohamad Nizam Kassim and Mohd Aizaini Maarof and Anazida Zainal and Amirudin Abdul Wahab},
title = {Enhanced Rules Application Order to Stem Affixation, Reduplication and Compounding Words in Malay Texts},
booktitle = {Lecture Notes in Computer Science},
publisher = {Springer Nature},
year = {2016},
pages = {71--85},
month = {aug},
doi = {10.1007/978-3-319-42706-5_6}
}