pages 3-18
CORKI: A Correlation-Driven Imputation Method for Partial Annotation Scenarios in Multi-label Clinical Problems
Ricardo Santos (1, 2), Bruno Ribeiro (1), Isabel Curioso (1), Marília Barandas (1, 2), André V. Carreiro (1), Hugo Gamboa (1, 2), Pedro Coelho (3, 4), José Fragata (3, 4), Inês Sousa (1)
2 LIBPhys-UNL, NOVA School of Science and Technology, Caparica, Portugal
3 Comprehensive Health Research Center, NOVA Medical School, Lisboa, Portugal
4 Hospital de Santa Marta, Centro Hospitalar Universitário Lisboa Central, Lisboa, Portugal
Publication type: Book Chapter
Publication date: 2025-01-01
scimago Q4
SJR: 0.182
CiteScore: 1.1
Impact factor: —
ISSN: 18650929, 18650937
Abstract
Multi-label classification tasks are relevant in healthcare, as data samples are commonly associated with multiple interdependent, non-mutually exclusive outcomes. Incomplete label information often arises due to unrecorded outcomes at planned checkpoints, varying disease testing across patients, collection constraints, or human error. Dropping partially annotated samples can reduce data size, introduce bias, and compromise accuracy. To address these issues, this study introduces CORKI (Correlation-Optimised and Robust K Nearest Neighbours Imputation for Multi-label Classification), a data-centric method for partial annotation imputation in multi-label data. This method employs proximity measures and an optional weighting term for outcome prevalence to tackle imbalanced labels. Additionally, it leverages different modalities of correlation that consider not only variable values but also missingness patterns. CORKI's performance was compared with a domain-knowledge-based rule system and the standard sample-dropping approach on three public and one private cardiothoracic surgery datasets with diverse missing label rates. CORKI yielded performance comparable to that of the domain-knowledge approach, establishing itself as a reliable method while being highly generalizable. Moreover, it maintained imputation accuracy in demanding partial annotation scenarios, with drops of only 5% at missing rates of 50%.
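To make the general idea of proximity-based label imputation concrete, the following is a minimal sketch and not the CORKI algorithm itself, whose details are not given in the abstract. It fills each missing label by a vote among the k nearest samples that have that label observed, with an optional prevalence weight. The function name `knn_impute_labels`, the Euclidean distance, the inverse-prevalence weights, and the toy data are all assumptions made for illustration; the correlation and missingness-pattern components mentioned in the abstract are not modelled here.

```python
import math

def knn_impute_labels(X, Y, k=3, prevalence_weighting=False):
    """Fill missing labels (None) in Y by a vote among the k nearest
    samples (Euclidean distance on X) that have the label observed.
    Hypothetical illustration only; not the published CORKI method."""
    n_labels = len(Y[0])
    imputed = [row[:] for row in Y]  # copy so the input is not modified
    for j in range(n_labels):
        observed = [i for i, row in enumerate(Y) if row[j] is not None]
        if not observed:
            continue  # no observed values for this label; leave it missing
        # Fraction of positives among the observed values of label j.
        prevalence = sum(Y[i][j] for i in observed) / len(observed)
        # Assumed weighting scheme: up-weight votes for the rarer class.
        if prevalence_weighting and 0 < prevalence < 1:
            w_pos, w_neg = 1 / prevalence, 1 / (1 - prevalence)
        else:
            w_pos = w_neg = 1.0
        for i, row in enumerate(Y):
            if row[j] is not None:
                continue
            # Neighbours are drawn only from samples with label j observed.
            neighbours = sorted(
                observed, key=lambda m: math.dist(X[i], X[m])
            )[:k]
            pos = sum(w_pos for m in neighbours if Y[m][j] == 1)
            neg = sum(w_neg for m in neighbours if Y[m][j] == 0)
            imputed[i][j] = 1 if pos > neg else 0
    return imputed

# Toy data: one feature, one rare positive label, last sample unlabelled.
X_toy = [[0.0], [1.0], [2.0], [3.0], [1.4]]
Y_toy = [[1], [0], [0], [0], [None]]
plain = knn_impute_labels(X_toy, Y_toy, k=3)
weighted = knn_impute_labels(X_toy, Y_toy, k=3, prevalence_weighting=True)
```

In this toy case the plain vote assigns the majority class to the unlabelled sample, while the weighted vote lets the single rare positive neighbour outweigh the two negatives, which is one way a prevalence term can counteract label imbalance.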
Metrics
Total citations: 0
Cite this
GOST
Santos R. et al. CORKI: A Correlation-Driven Imputation Method for Partial Annotation Scenarios in Multi-label Clinical Problems // Communications in Computer and Information Science. 2025. pp. 3-18.
GOST all authors (up to 50)
Santos R., Ribeiro B., Curioso I., Barandas M., Carreiro A. V., Gamboa H., Coelho P., Fragata J., Sousa I. CORKI: A Correlation-Driven Imputation Method for Partial Annotation Scenarios in Multi-label Clinical Problems // Communications in Computer and Information Science. 2025. pp. 3-18.
RIS
TY  - CHAP
DO  - 10.1007/978-3-031-74640-6_1
UR  - https://link.springer.com/10.1007/978-3-031-74640-6_1
TI  - CORKI: A Correlation-Driven Imputation Method for Partial Annotation Scenarios in Multi-label Clinical Problems
T2  - Communications in Computer and Information Science
AU  - Santos, Ricardo
AU  - Ribeiro, Bruno
AU  - Curioso, Isabel
AU  - Barandas, Marília
AU  - Carreiro, André V.
AU  - Gamboa, Hugo
AU  - Coelho, Pedro
AU  - Fragata, José
AU  - Sousa, Inês
PY  - 2025
DA  - 2025/01/01
PB  - Springer Nature
SP  - 3
EP  - 18
SN  - 1865-0929
SN  - 1865-0937
ER  - 
BibTex (up to 50 authors)
@incollection{2025_Santos,
  author    = {Ricardo Santos and Bruno Ribeiro and Isabel Curioso and Marília Barandas and André V. Carreiro and Hugo Gamboa and Pedro Coelho and José Fragata and Inês Sousa},
  title     = {CORKI: A Correlation-Driven Imputation Method for Partial Annotation Scenarios in Multi-label Clinical Problems},
  series    = {Communications in Computer and Information Science},
  publisher = {Springer Nature},
  year      = {2025},
  month     = jan,
  pages     = {3--18},
  doi       = {10.1007/978-3-031-74640-6_1}
}