TransQAM: Transformer-based Question Answering System in Malayalam
Question Answering (QA) systems extract the exact answer to a question from a given context. In this study, we implement TransQAM, a QA system for the low-resource Malayalam language built on BERT and its variants. We consider four transformer models: BERT, multilingual BERT, XLM-RoBERTa, and MuRIL. Since no Malayalam QA dataset is publicly available, we have built and publicly released a Malayalam QA dataset of 30k question-answer pairs in SQuAD (Stanford Question Answering Dataset) format. TransQAM implemented with MuRIL obtains state-of-the-art results. Owing to advances in language models and active research, languages such as English have well-developed QA systems, whereas Malayalam remains comparatively unexplored. To the best of our knowledge, TransQAM is the first QA system in Malayalam that successfully applies transformer models to answer questions, and it achieves more than 80% accuracy.
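
For readers unfamiliar with the SQuAD layout mentioned above, the following is a minimal sketch of one record in the SQuAD v1.1 JSON structure, together with a check that each answer span aligns with its context. The record content, the id, and the helper name `misaligned_answers` are illustrative assumptions; the text is an English placeholder and is not taken from the TransQAM dataset.

```python
import json

# Hypothetical record in SQuAD v1.1 layout. The text is an English
# placeholder, not an entry from the TransQAM dataset.
context = "Thiruvananthapuram is the capital of Kerala."
answer_text = "Thiruvananthapuram"

squad_like = {
    "version": "1.1",
    "data": [{
        "title": "Kerala",
        "paragraphs": [{
            "context": context,
            "qas": [{
                "id": "example-0001",  # illustrative id
                "question": "What is the capital of Kerala?",
                "answers": [{
                    "text": answer_text,
                    # Character offset of the answer inside the context.
                    "answer_start": context.find(answer_text),
                }],
            }],
        }],
    }],
}


def misaligned_answers(dataset):
    """Return ids of questions whose answer span does not match the context."""
    bad = []
    for article in dataset["data"]:
        for para in article["paragraphs"]:
            ctx = para["context"]
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    if ctx[start:start + len(ans["text"])] != ans["text"]:
                        bad.append(qa["id"])
    return bad


# ensure_ascii=False keeps non-Latin scripts such as Malayalam readable.
print(json.dumps(squad_like, ensure_ascii=False, indent=2))
print(misaligned_answers(squad_like))
```

Python strings are indexed by Unicode code point, so the same check applies to Malayalam text provided that `answer_start` was also computed as a code-point offset; that is an assumption about how a given dataset was annotated.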