Open Access
Vectorized Highly Parallel Density-based Clustering for Applications with Noise
Joseph Arnold Xavier (1), Juan Pedro Gutiérrez Hermosillo Muriedas (2), Stepan Nassyr (1), Rocco Sedona (1), Markus Götz (2), Achim Streit (2), Morris Riedel (1, 3), Gabriele Cavallaro (1)
(1) Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, Jülich, Germany
Publication type: Journal Article
Publication date: 2024-11-27
scimago Q1
wos Q2
SJR: 0.849
CiteScore: 9.0
Impact factor: 3.6
ISSN: 2169-3536
Abstract
Clustering in data mining groups similar objects into categories based on their characteristics. As data volumes grow and high-performance computing advances, there is a critical need for algorithms that process these workloads efficiently and exploit the various levels of parallelism offered by modern supercomputing systems. Single Instruction Multiple Data (SIMD) instructions enhance parallelism at the instruction level and minimize data movement within the memory hierarchy. To fully harness a processor’s SIMD capabilities and achieve optimal performance, algorithms must be adapted for better compatibility with vector operations. In this paper, we introduce a vectorized implementation of the Density-based Clustering for Applications with Noise (DBSCAN) algorithm suitable for execution on both shared and distributed memory systems. By leveraging SIMD, we enhance the performance of distance computations. Our proposed Vectorized HPDBSCAN (VHPDBSCAN) achieves up to a twofold performance improvement over the state-of-the-art parallel version, Highly Parallel DBSCAN (HPDBSCAN), on the ARM-based A64FX processor on two different datasets with varying dimensions. We have also parallelized computations that are essential for efficient workload distribution, which significantly improves performance on higher-dimensional datasets. Additionally, we evaluate VHPDBSCAN’s energy consumption on the A64FX and Intel Xeon processors. Owing to the reduced runtime, the total energy consumption of the application compared to HPDBSCAN is reduced by 50% on the A64FX Central Processing Unit (CPU) and by approximately 19% on the Intel Xeon 8368 CPU.
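The abstract does not show the vectorized kernels themselves, so the following is only a minimal sketch of the general idea behind SIMD-friendly distance computation in a DBSCAN neighborhood query. It assumes a dimension-major (structure-of-arrays) data layout and relies on compiler auto-vectorization; the function name `count_neighbors`, its parameters, and the layout are hypothetical and are not taken from VHPDBSCAN or HPDBSCAN.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the paper): counts the points whose squared
// Euclidean distance to `query` is at most eps * eps.
// Assumed layout: `coords` is dimension-major, i.e. the values of dimension d
// for all n points are contiguous in coords[d * n] .. coords[d * n + n - 1].
std::size_t count_neighbors(const std::vector<float>& coords,
                            std::size_t n, std::size_t dims,
                            const std::vector<float>& query, float eps) {
    std::vector<float> dist2(n, 0.0f);  // running squared distance per point

    for (std::size_t d = 0; d < dims; ++d) {
        const float* column = coords.data() + d * n;
        const float q = query[d];
        // Unit-stride, branch-free inner loop: the kind of loop a compiler
        // can usually map onto SIMD instructions.
        for (std::size_t i = 0; i < n; ++i) {
            const float diff = column[i] - q;
            dist2[i] += diff * diff;
        }
    }

    // Comparing squared distances avoids a square root per point.
    const float eps2 = eps * eps;
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += (dist2[i] <= eps2) ? 1u : 0u;
    }
    return count;
}
```

In DBSCAN, a point is a core point when this count reaches the minimum-points threshold. Keeping each dimension contiguous means the inner loop reads memory sequentially, which is one common way to make distance computations amenable to vector units; whether the paper uses this layout or explicit intrinsics is not stated in the abstract.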
Metrics
Total citations: 0
Cite this
GOST
Xavier J. A. et al. Vectorized Highly Parallel Density-based Clustering for Applications with Noise // IEEE Access. 2024. Vol. 12. pp. 181679-181692.
GOST all authors (up to 50)
Xavier J. A., Gutiérrez Hermosillo Muriedas J. P., Nassyr S., Sedona R., Götz M., Streit A., Riedel M., Cavallaro G. Vectorized Highly Parallel Density-based Clustering for Applications with Noise // IEEE Access. 2024. Vol. 12. pp. 181679-181692.
RIS
TY - JOUR
DO - 10.1109/access.2024.3507193
UR - https://ieeexplore.ieee.org/document/10769413/
TI - Vectorized Highly Parallel Density-based Clustering for Applications with Noise
T2 - IEEE Access
AU - Xavier, Joseph Arnold
AU - Gutiérrez Hermosillo Muriedas, Juan Pedro
AU - Nassyr, Stepan
AU - Sedona, Rocco
AU - Götz, Markus
AU - Streit, Achim
AU - Riedel, Morris
AU - Cavallaro, Gabriele
PY - 2024
DA - 2024/11/27
PB - Institute of Electrical and Electronics Engineers (IEEE)
SP - 181679
EP - 181692
VL - 12
SN - 2169-3536
ER -
BibTeX (up to 50 authors)
@article{2024_Xavier,
author = {Joseph Arnold Xavier and Juan Pedro Gutiérrez Hermosillo Muriedas and Stepan Nassyr and Rocco Sedona and Markus Götz and Achim Streit and Morris Riedel and Gabriele Cavallaro},
title = {Vectorized Highly Parallel Density-based Clustering for Applications with Noise},
journal = {IEEE Access},
year = {2024},
volume = {12},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
month = {nov},
url = {https://ieeexplore.ieee.org/document/10769413/},
pages = {181679--181692},
doi = {10.1109/access.2024.3507193}
}