Open Access
Volume 25, issue 1, article number 84

An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records

Hongnian Wang 1, 2
Mingyang Zhang 3
Liyi Mai 4
Xin Li 5
Abdelouahab Bellou 4, 5, 6, 7
Lijuan Wu 4, 8
Publication type: Journal Article
Publication date: 2025-02-17
Scimago: Q1
WoS: Q2
White List (БС): 2
SJR: 1.041
CiteScore: 7.4
Impact factor: 3.8
ISSN: 1472-6947
Abstract
Identifying key variables is essential for developing clinical outcome prediction models based on high-dimensional electronic medical records (EMR). However, despite the abundance of feature selection (FS) methods available, challenges remain in choosing the most appropriate method, deciding how many top-ranked variables to include, and ensuring these selections are meaningful from a medical perspective.

We developed a practical multi-step FS framework that integrates data-driven statistical inference with a knowledge verification strategy. This framework was validated using two distinct EMR datasets targeting different clinical outcomes. The first cohort, sourced from the Medical Information Mart for Intensive Care III (MIMIC-III), focused on predicting acute kidney injury (AKI) in ICU patients. The second cohort, drawn from the MIMIC-IV Emergency Department (MIMIC-IV-ED), aimed to estimate in-hospital mortality (IHM) for patients transferred from the ED to the ICU. We employed various machine learning (ML) methods and conducted a comparative analysis considering accuracy, stability, similarity, and interpretability. The effectiveness of our FS framework was evaluated using discrimination and calibration metrics, with SHAP applied to enhance the interpretability of model decisions.

Cohort 1 comprised 48,780 ICU encounters, of which 8,883 (18.21%) developed AKI. Cohort 2 included 29,197 transfers from the ED to the ICU, with 3,219 (11.03%) resulting in IHM. Among the ten ML methods evaluated, the tree-based ensemble method achieved the highest accuracy. As the number of top-ranking features increased, the models’ accuracy began to stabilize, while feature subset stability (considering sample variations) and inter-method feature similarity reached optimal levels, confirming the validity of the FS framework. The integration of interpretative methods and expert knowledge in the final step further improved feature interpretability. The FS framework effectively reduced the number of features (e.g., from 380 to 35 for Cohort 1, and from 273 to 54 for Cohort 2) without significantly affecting prediction performance (DeLong test, p > 0.05).

The multi-step FS method developed in this study successfully reduces the dimensionality of features in EMR while preserving the accuracy of clinical outcome prediction. Furthermore, it improves the interpretability of risk factors by incorporating expert knowledge validation.
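
The abstract does not say how feature subset stability or inter-method similarity were quantified. As a rough illustration only, the sketch below assumes the mean pairwise Jaccard index over top-k feature subsets, which is one common choice and not necessarily the authors' metric. The helper names (`top_k`, `jaccard`, `mean_pairwise_jaccard`), the feature names, and the importance scores are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence, Set


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the k features with the highest scores (ties broken by name)."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |A & B| / |A | B|; defined as 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(subsets: Sequence[Set[str]]) -> float:
    """Average Jaccard index over all unordered pairs of subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical importance scores from three resamples of one FS method.
# Passing one score dict per FS method instead of per resample gives an
# inter-method similarity under the same assumed metric.
resample_scores = [
    {"creatinine": 0.9, "age": 0.7, "lactate": 0.5, "sodium": 0.1},
    {"creatinine": 0.8, "age": 0.4, "lactate": 0.6, "sodium": 0.2},
    {"creatinine": 0.9, "age": 0.3, "lactate": 0.2, "sodium": 0.5},
]

# Stability of the top-k subset as k grows.
stability_by_k = {
    k: mean_pairwise_jaccard([top_k(s, k) for s in resample_scores])
    for k in (1, 2, 3)
}
```

Plotting a quantity like `stability_by_k` against k, next to model accuracy, is one way to look for the kind of plateau the abstract describes when deciding how many top-ranked features to keep.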

Top 30

Journals

BMC Nephrology: 1 publication, 50%
Scientific Reports: 1 publication, 50%

Publishers

Springer Nature: 2 publications, 100%

  • Publications without a DOI are not counted.
  • Publication statistics are updated weekly.

Cite

GOST
Wang H. et al. An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records // BMC Medical Informatics and Decision Making. 2025. Vol. 25. No. 1. 84
GOST with all authors (up to 50)
Wang H., Zhang M., Mai L., Li X., Bellou A., Wu L. An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records // BMC Medical Informatics and Decision Making. 2025. Vol. 25. No. 1. 84
RIS
TY  - JOUR
DO  - 10.1186/s12911-025-02922-y
UR  - https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-025-02922-y
TI  - An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records
T2  - BMC Medical Informatics and Decision Making
AU  - Wang, Hongnian
AU  - Zhang, Mingyang
AU  - Mai, Liyi
AU  - Li, Xin
AU  - Bellou, Abdelouahab
AU  - Wu, Lijuan
PY  - 2025
DA  - 2025/02/17
PB  - Springer Nature
IS  - 1
VL  - 25
SN  - 1472-6947
ER  - 
BibTeX (up to 50 authors)
@article{2025_Wang,
author = {Hongnian Wang and Mingyang Zhang and Liyi Mai and Xin Li and Abdelouahab Bellou and Lijuan Wu},
title = {An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records},
journal = {BMC Medical Informatics and Decision Making},
year = {2025},
volume = {25},
publisher = {Springer Nature},
month = {feb},
url = {https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-025-02922-y},
number = {1},
pages = {84},
doi = {10.1186/s12911-025-02922-y}
}