Open Access

Applied Sciences (Switzerland), volume 15, issue 8, pages 4243

Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data

Serhii Semenov ¹

Magdalena Krupska-Klimczak ¹

Roman Czapla ¹

Beata Krzaczek ¹

Svitlana Gavrylenko ²

Vadym Poltoratskyi ²

Zozulia Vladislav ²

¹ Institute of Security and Computer Science, University of National Education Commission, ul. Podchorążych 2, 30-084 Krakow, Poland

² Department of “Computer Engineering and Programming”, National Technical University «Kharkiv Polytechnic Institute», 61000 Kharkiv, Ukraine

Publication type: Journal Article

Publication date: 2025-04-11

Publisher: MDPI

scimago Q2

wos Q2

SJR: 0.521

CiteScore: 5.5

Impact factor: 2.5

ISSN: 2076-3417 (Electronic)

DOI: 10.3390/app15084243

Abstract

This paper examines traditional machine learning algorithms, neural networks, and the benefits of ensemble models. It considers data preprocessing methods that improve the quality of classification models. To balance the classes, undersampling, oversampling, and their combination (over- plus undersampling) are explored, and a procedure for reducing feature correlation is proposed. Classification models based on individual classifiers (SVM, KNN, Naive Bayes, Perceptron) and on meta-algorithms (Bagging, Random Forest, AdaBoost, Gradient Boosting) are investigated, and the settings of the base classifiers and the meta-algorithm parameters are optimized. The best result was obtained with an ensemble classifier based on the Random Forest algorithm. On this basis, an intrusion detection method built on the preprocessing of highly correlated and imbalanced data is proposed. The scientific novelty of the results lies in the integrated use of the developed procedure for reducing feature correlation, the SMOTEENN data balancing method, the selection of an appropriate classifier, and the fine-tuning of its parameters. Together, these procedures and methods yielded a higher F1 score, a shorter training time, and faster recognition. The method can therefore be recommended for practical use to improve the quality of network intrusion detection.
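The abstract does not spell out the correlation-reduction procedure, so the following is only a minimal sketch of one common approach, assuming a greedy filter on pairwise Pearson correlation. The function names `pearson` and `drop_correlated`, the 0.9 threshold, and the toy feature names are illustrative assumptions, not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Greedy filter (assumed, not from the paper): keep a feature only if its
    absolute correlation with every already-kept feature is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical feature names; "bytes" is an exact multiple of "packets".
features = {
    "packets": [1, 2, 3, 4, 5],
    "bytes": [100, 200, 300, 400, 500],
    "duration": [5, 1, 4, 2, 3],
}
print(drop_correlated(features))
```

In this toy case the redundant "bytes" column is discarded while the weakly correlated "duration" column is kept. In a full pipeline of the kind the abstract describes, such a filter would run before class balancing and classifier training.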

Cite this

GOST

Semenov S. et al. Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data // Applied Sciences (Switzerland). 2025. Vol. 15. No. 8. p. 4243.

GOST (all authors)

Semenov S., Krupska-Klimczak M., Czapla R., Krzaczek B., Gavrylenko S., Poltoratskyi V., Vladislav Z. Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data // Applied Sciences (Switzerland). 2025. Vol. 15. No. 8. p. 4243.

RIS

TY - JOUR
DO - 10.3390/app15084243
UR - https://www.mdpi.com/2076-3417/15/8/4243
TI - Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data
T2 - Applied Sciences (Switzerland)
AU - Semenov, Serhii
AU - Krupska-Klimczak, Magdalena
AU - Czapla, Roman
AU - Krzaczek, Beata
AU - Gavrylenko, Svitlana
AU - Poltoratskyi, Vadym
AU - Vladislav, Zozulia
PY - 2025
DA - 2025/04/11
PB - MDPI
SP - 4243
IS - 8
VL - 15
SN - 2076-3417
ER -

BibTeX

@article{2025_Semenov,
  author = {Serhii Semenov and Magdalena Krupska-Klimczak and Roman Czapla and Beata Krzaczek and Svitlana Gavrylenko and Vadym Poltoratskyi and Zozulia Vladislav},
  title = {Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data},
  journal = {Applied Sciences (Switzerland)},
  year = {2025},
  volume = {15},
  publisher = {MDPI},
  month = {apr},
  url = {https://www.mdpi.com/2076-3417/15/8/4243},
  number = {8},
  pages = {4243},
  doi = {10.3390/app15084243}
}

MLA

Semenov, Serhii, et al. “Intrusion Detection Method Based on Preprocessing of Highly Correlated and Imbalanced Data.” Applied Sciences (Switzerland), vol. 15, no. 8, Apr. 2025, p. 4243. https://www.mdpi.com/2076-3417/15/8/4243.
