Multi-label Classification of Pure Code

Publication type: Journal Article
Publication date: 2024-07-29
Scimago: Q3
WoS: Q4
SJR: 0.206
CiteScore: 1.8
Impact factor: 0.6
ISSN: 0218-1940, 1793-6403
Abstract

A large amount of public code is now available in IT communities, programming forums, and code repositories. Much of this code lacks classification labels or carries imprecise ones, which hampers code management and retrieval. Several classification methods have been proposed to assign labels to code automatically. However, these methods rely mainly on code comments or surrounding text, so their classification performance is limited by the quality of that text. So far, few methods rely solely on the code itself to assign labels. In this paper, an encoder-only method is proposed to assign multiple labels to the code of an algorithmic problem: UniXcoder encodes the input code, and classification heads map the encoding results to the output labels. The proposed method relies only on the code itself. We construct a dataset to evaluate the proposed method, consisting of source code in three programming languages (C++, Java, Python) with a total size of approximately 120K. The comparative experiments show that the proposed method outperforms encoder-decoder methods on the multi-label classification of pure code.
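
The abstract does not say how the classification heads turn an encoding into labels. One common arrangement, assumed here purely for illustration, is an independent linear head per label followed by a sigmoid and a fixed threshold, so that any number of labels can fire for one input. In the sketch below the embedding, the label names, the weights, and the 0.5 threshold are all hypothetical stand-ins; a real system would take the embedding from the encoder (UniXcoder in the paper) and learn the head weights.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(embedding, heads, threshold=0.5):
    """Return every label whose head scores at or above the threshold.

    embedding: list of floats standing in for the encoder output (assumed).
    heads: dict mapping label -> (weights, bias), one linear head per label.
    """
    predicted = []
    for label, (weights, bias) in heads.items():
        logit = sum(w * x for w, x in zip(weights, embedding)) + bias
        # Each label is decided independently, so zero, one, or many can fire.
        if sigmoid(logit) >= threshold:
            predicted.append(label)
    return predicted


# Hypothetical 3-dimensional embedding and hand-picked head weights.
embedding = [1.0, -2.0, 0.5]
heads = {
    "greedy": ([2.0, 0.0, 0.0], -1.0),  # logit = 1.0
    "graphs": ([0.0, 1.0, 0.0], 0.0),   # logit = -2.0
    "dp": ([0.0, -1.0, 2.0], -0.5),     # logit = 2.5
}

print(classify(embedding, heads))
```

Because each head is thresholded on its own, the label scores do not need to sum to one, which is what distinguishes this multi-label setup from a single softmax over mutually exclusive classes.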

GOST
Gao B. et al. Multi-label Classification of Pure Code // International Journal of Software Engineering and Knowledge Engineering. 2024. Vol. 34. No. 10. pp. 1641-1659.
GOST (all authors)
Gao B., Qin H., Ma X. Multi-label Classification of Pure Code // International Journal of Software Engineering and Knowledge Engineering. 2024. Vol. 34. No. 10. pp. 1641-1659.
RIS
TY - JOUR
DO - 10.1142/s0218194024500311
UR - https://www.worldscientific.com/doi/10.1142/S0218194024500311
TI - Multi-label Classification of Pure Code
T2 - International Journal of Software Engineering and Knowledge Engineering
AU - Gao, Bin
AU - Qin, Hongwu
AU - Ma, Xiuqin
PY - 2024
DA - 2024/07/29
PB - World Scientific
SP - 1641
EP - 1659
IS - 10
VL - 34
SN - 0218-1940
SN - 1793-6403
ER -
BibTeX
@article{2024_Gao,
author = {Bin Gao and Hongwu Qin and Xiuqin Ma},
title = {Multi-label Classification of Pure Code},
journal = {International Journal of Software Engineering and Knowledge Engineering},
year = {2024},
volume = {34},
publisher = {World Scientific},
month = jul,
url = {https://www.worldscientific.com/doi/10.1142/S0218194024500311},
number = {10},
pages = {1641--1659},
doi = {10.1142/s0218194024500311}
}
MLA
Gao, Bin, et al. “Multi-label Classification of Pure Code.” International Journal of Software Engineering and Knowledge Engineering, vol. 34, no. 10, Jul. 2024, pp. 1641-1659. https://www.worldscientific.com/doi/10.1142/S0218194024500311.