Detection of malicious javascript on an imbalanced dataset
Publication type: Journal Article
Publication date: 2021-03-01
scimago Q1
wos Q1
SJR: 0.795
CiteScore: 12.4
Impact factor: 7.6
ISSN: 21991073, 21991081, 25426605
Computer Science Applications
Computer Science (miscellaneous)
Hardware and Architecture
Information Systems
Artificial Intelligence
Software
Management of Technology and Innovation
Engineering (miscellaneous)
Abstract
To detect new malicious JavaScript at low cost, methods based on machine learning have been proposed and have given positive results. These methods aim for a lightweight filtering model that can quickly and precisely filter out malicious data for dynamic analysis. One such method uses Natural Language Processing techniques to build a language model that represents the source code in vector form for machine learning. This method scores highly on a balanced dataset, but it has not been evaluated on an imbalanced one. Previous studies mainly focus on balanced datasets; such datasets are not representative of real-world data, which raises questions about the practical use of these models. A good filter needs a model that achieves a high recall score on an imbalanced dataset. To construct an efficient language model and to deal with the data imbalance problem, we focus on oversampling techniques. Our method is the first to use oversampling together with machine learning to detect malicious JavaScript. The experimental results show that our method can detect new malicious JavaScript more accurately and efficiently. Our model can quickly filter out malicious data for dynamic analysis. The best recall score, 0.72, is achieved with the Doc2Vec model. Our proposed method outperforms the baseline method by 210% in terms of recall score, with the same training time and test time per sample.
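The two ideas the abstract relies on, oversampling the minority class and scoring a filter by recall, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not say which oversampling technique was used, so plain random oversampling is assumed here as the simplest variant. The function names `random_oversample` and `recall` are hypothetical, and the toy 2-d vectors merely stand in for the document vectors a model such as Doc2Vec would produce.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class (assumed technique)."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (malicious) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 8 benign (label 0) and 2 malicious (label 1) vectors.
vectors = [[0.1 * i, 0.2 * i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_vectors, balanced_labels = random_oversample(vectors, labels)
print(Counter(balanced_labels))  # class counts after oversampling

# Made-up predictions: two malicious samples, one of them detected.
print(recall([0, 0, 1, 1], [0, 1, 1, 0]))
```

Oversampling is applied to the training data only; recall is then measured on an untouched, still imbalanced test set, since recall counts how many of the truly malicious samples the filter catches.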
Top-30
Journals
- IEEE Access: 3 publications, 13.04%
- Applied Sciences (Switzerland): 2 publications, 8.7%
- Internet of Things: 2 publications, 8.7%
- Lecture Notes in Computer Science: 2 publications, 8.7%
- Journal of Information Processing: 2 publications, 8.7%
- Journal of Intelligent and Fuzzy Systems: 1 publication, 4.35%
- Sensors: 1 publication, 4.35%
- International Journal of Information Security: 1 publication, 4.35%
- Applied Soft Computing Journal: 1 publication, 4.35%
- Journal of King Saud University - Computer and Information Sciences: 1 publication, 4.35%
- Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering: 1 publication, 4.35%
- SSRN Electronic Journal: 1 publication, 4.35%
- Artificial Intelligence Review: 1 publication, 4.35%
- Lecture Notes in Networks and Systems: 1 publication, 4.35%
- ETRI Journal: 1 publication, 4.35%
- Computer Science Review: 1 publication, 4.35%
Publishers
- Springer Nature: 8 publications, 34.78%
- Institute of Electrical and Electronics Engineers (IEEE): 4 publications, 17.39%
- MDPI: 3 publications, 13.04%
- Elsevier: 3 publications, 13.04%
- Information Processing Society of Japan: 2 publications, 8.7%
- SAGE: 1 publication, 4.35%
- King Saud University: 1 publication, 4.35%
- Wiley: 1 publication, 4.35%
- We do not take into account publications without a DOI.
- Statistics recalculated weekly.
Metrics
Total citations: 23
Citations from 2024: 7 (30.44%)
Cite this
GOST
Phung N. M., MIMURA M. Detection of malicious javascript on an imbalanced dataset // Internet of Things. 2021. Vol. 13. p. 100357.
RIS
TY - JOUR
DO - 10.1016/j.iot.2021.100357
UR - https://doi.org/10.1016/j.iot.2021.100357
TI - Detection of malicious javascript on an imbalanced dataset
T2 - Internet of Things
AU - Phung, Ngoc Minh
AU - MIMURA, Mamoru
PY - 2021
DA - 2021/03/01
PB - Elsevier
SP - 100357
VL - 13
SN - 2199-1073
SN - 2199-1081
SN - 2542-6605
ER -
BibTeX
@article{2021_Phung,
author = {Ngoc Minh Phung and Mamoru MIMURA},
title = {Detection of malicious javascript on an imbalanced dataset},
journal = {Internet of Things},
year = {2021},
volume = {13},
publisher = {Elsevier},
month = {mar},
url = {https://doi.org/10.1016/j.iot.2021.100357},
pages = {100357},
doi = {10.1016/j.iot.2021.100357}
}