Volume 5, issue 1, publication number 12

A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification

Kanish Shah 1
Henil Patel 1
Devanshi Sanghvi 1
Manan Shah 2
Publication type: Journal Article
Publication date: 2020-03-05
ISSN: 2365-4317, 2365-4325
Abstract
A huge volume of textual documents is now being generated, and there is an urgent need to organize them in a proper structure so that they can be classified and their categories properly defined. Text classification is the key technology for gaining insight into textual information and organizing it. Documents are assigned to classes by determining the text type of their content. The text classification system in this paper, built on several machine learning algorithms, is divided into four stages: text pre-treatment, text representation, classifier implementation and classification. A BBC news text classification system is designed. For the classifier implementation stage, the authors chose logistic regression, random forest and K-nearest neighbour as the classification algorithms. These classifiers were then tested, analysed and compared with one another to reach a conclusion. The experiments show that the BBC news text classification model achieves satisfactory results with the algorithms tested on the data set. The comparison is based on five parameters: precision, accuracy, F1-score, support and the confusion matrix. The classifier that scores highest across these parameters is deemed the best machine learning algorithm for the BBC news data set.
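To make the four stages and the evaluation parameters named in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It covers only K-nearest neighbour of the three classifiers. The headlines, the stop-word list, the choice of bag-of-words counts with cosine similarity, and all function names (`preprocess`, `represent`, `knn_predict`, `evaluate`) are assumptions made for illustration; the real BBC news data set is not reproduced.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "as", "has", "in", "for"}

# Invented toy headlines; the real BBC news data set is not reproduced here.
TRAIN = [
    ("the team won the football match", "sport"),
    ("the striker scored a late goal", "sport"),
    ("the company reported strong quarterly profits", "business"),
    ("shares fell as the market closed lower", "business"),
    ("the new phone has a faster processor", "tech"),
    ("software update improves laptop battery", "tech"),
]
TEST = [
    ("goal scored in football final", "sport"),
    ("market profits rise for company", "business"),
    ("faster laptop processor", "tech"),
]


def preprocess(text):
    """Stage 1, pre-treatment: lowercase, tokenize, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def represent(text):
    """Stage 2, representation: bag-of-words term counts."""
    return Counter(preprocess(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train_vectors, text, k=3):
    """Stages 3 and 4: majority vote among the k most similar training texts."""
    query = represent(text)
    ranked = sorted(train_vectors, key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def evaluate(y_true, y_pred, labels):
    """Return accuracy plus per-class precision, F1-score and support."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    per_class = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        support = sum(matrix[i])                   # row total
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        per_class[label] = {"precision": precision, "f1": f1, "support": support}
    correct = sum(matrix[i][i] for i in range(len(labels)))
    accuracy = correct / len(y_true) if y_true else 0.0
    return accuracy, per_class


if __name__ == "__main__":
    labels = ["business", "sport", "tech"]
    train_vectors = [(represent(text), label) for text, label in TRAIN]
    y_true = [label for _, label in TEST]
    y_pred = [knn_predict(train_vectors, text) for text, _ in TEST]
    accuracy, per_class = evaluate(y_true, y_pred, labels)
    print("accuracy:", accuracy)
    for label in labels:
        print(label, per_class[label])
    print(confusion_matrix(y_true, y_pred, labels))
```

Recall is computed here only as an intermediate step towards the F1-score. Comparing several classifiers, as the abstract describes, would amount to running `evaluate` once per classifier on the same test split and comparing the resulting figures.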

Top-30

Journals

  • IEEE Access: 23 publications, 6.08%
  • Lecture Notes in Networks and Systems: 17 publications, 4.5%
  • Applied Sciences (Switzerland): 8 publications, 2.12%
  • Lecture Notes in Computer Science: 8 publications, 2.12%
  • PLoS ONE: 5 publications, 1.32%
  • Sensors: 4 publications, 1.06%
  • Multimedia Tools and Applications: 4 publications, 1.06%
  • Procedia Computer Science: 4 publications, 1.06%
  • Journal of Intelligent and Fuzzy Systems: 3 publications, 0.79%
  • Arabian Journal for Science and Engineering: 3 publications, 0.79%
  • Lecture Notes in Electrical Engineering: 3 publications, 0.79%
  • Cryptology and Network Security with Machine Learning: 3 publications, 0.79%
  • Scientific Reports: 3 publications, 0.79%
  • ACM Transactions on Asian and Low-Resource Language Information Processing: 2 publications, 0.53%
  • Sustainability: 2 publications, 0.53%
  • Information (Switzerland): 2 publications, 0.53%
  • Augmented Human Research: 2 publications, 0.53%
  • International Journal of Energy and Water Resources: 2 publications, 0.53%
  • Visual Computing for Industry Biomedicine and Art: 2 publications, 0.53%
  • Electronics (Switzerland): 2 publications, 0.53%
  • Expert Systems with Applications: 2 publications, 0.53%
  • Journal of Physics: Conference Series: 2 publications, 0.53%
  • Artificial Intelligence in Agriculture: 2 publications, 0.53%
  • Natural Language Processing Journal: 2 publications, 0.53%
  • Journal of Information Security and Applications: 2 publications, 0.53%
  • Mathematical Biosciences and Engineering: 2 publications, 0.53%
  • Computational Intelligence and Neuroscience: 2 publications, 0.53%
  • Soft Computing: 2 publications, 0.53%
  • Communications in Computer and Information Science: 2 publications, 0.53%

Publishers

  • Institute of Electrical and Electronics Engineers (IEEE): 115 publications, 30.42%
  • Springer Nature: 95 publications, 25.13%
  • Elsevier: 55 publications, 14.55%
  • MDPI: 29 publications, 7.67%
  • Association for Computing Machinery (ACM): 9 publications, 2.38%
  • Taylor & Francis: 7 publications, 1.85%
  • SAGE: 6 publications, 1.59%
  • Public Library of Science (PLoS): 6 publications, 1.59%
  • Hindawi Limited: 6 publications, 1.59%
  • Wiley: 5 publications, 1.32%
  • Emerald: 4 publications, 1.06%
  • Tech Science Press: 3 publications, 0.79%
  • IOP Publishing: 3 publications, 0.79%
  • Frontiers Media S.A.: 3 publications, 0.79%
  • Walter de Gruyter: 3 publications, 0.79%
  • World Scientific: 2 publications, 0.53%
  • AIP Publishing: 2 publications, 0.53%
  • IGI Global: 2 publications, 0.53%
  • PeerJ: 2 publications, 0.53%
  • Aurora Group, s.r.o: 2 publications, 0.53%
  • Alexandria University: 1 publication, 0.26%
  • American Institute of Mathematical Sciences (AIMS): 1 publication, 0.26%
  • Privacy Enhancing Technologies Symposium Advisory Board: 1 publication, 0.26%
  • Scientific Research Publishing: 1 publication, 0.26%
  • EDP Sciences: 1 publication, 0.26%
  • Research Square Platform LLC: 1 publication, 0.26%
  • Arizona State University: 1 publication, 0.26%
  • Ovid Technologies (Wolters Kluwer Health): 1 publication, 0.26%
  • Society of Petroleum Engineers: 1 publication, 0.26%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
378
Cite this
GOST
Shah K. et al. A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification // Augmented Human Research. 2020. Vol. 5. No. 1. 12
GOST, all authors (up to 50)
Shah K., Patel H., Sanghvi D., Shah M. A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification // Augmented Human Research. 2020. Vol. 5. No. 1. 12
RIS
TY - JOUR
DO - 10.1007/s41133-020-00032-0
UR - https://doi.org/10.1007/s41133-020-00032-0
TI - A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification
T2 - Augmented Human Research
AU - Shah, Kanish
AU - Patel, Henil
AU - Sanghvi, Devanshi
AU - Shah, Manan
PY - 2020
DA - 2020/03/05
PB - Springer Nature
IS - 1
VL - 5
SN - 2365-4317
SN - 2365-4325
ER -
BibTeX (up to 50 authors)
@article{2020_Shah,
author = {Kanish Shah and Henil Patel and Devanshi Sanghvi and Manan Shah},
title = {A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification},
journal = {Augmented Human Research},
year = {2020},
volume = {5},
publisher = {Springer Nature},
month = {mar},
url = {https://doi.org/10.1007/s41133-020-00032-0},
number = {1},
pages = {12},
doi = {10.1007/s41133-020-00032-0}
}