Open Access
Volume 25, issue 1, article number 253

Random forests for the analysis of matched case–control studies

Publication type: Journal Article
Publication date: 2024-08-01
Scimago: Q1
WoS: Q1
SJR: 1.190
CiteScore: 6.8
Impact factor: 3.3
ISSN: 1471-2105
Abstract
Background

Conditional logistic regression trees have been proposed as a flexible alternative to conditional logistic regression, the standard method for the analysis of matched case–control studies. While they avoid the strict assumption of linearity and automatically incorporate interactions, conditional logistic regression trees may suffer from relatively high variability. Other machine learning methods for the analysis of matched case–control studies are lacking, because conventional machine learning methods cannot handle the matched structure of the data.

Results

A random forest method for the analysis of matched case–control studies is proposed. It is based on conditional logistic regression trees and overcomes their high variability. It provides accurate estimates of exposure effects while being more flexible in the functional form of covariate effects. The efficacy of the method is illustrated in a simulation study and in an application to real-world data from a matched case–control study on the effect of regular participation in cervical cancer screening on the development of cervical cancer.

Conclusions

The proposed random forest method is a promising addition to the toolbox for the analysis of matched case–control studies and addresses the need for machine learning methods in this field. It is more flexible than both the standard method of conditional logistic regression and conditional logistic regression trees. It allows for non-linearity and the automatic inclusion of interaction effects, and it is suitable for both exploratory and explanatory analyses.
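As background on why the matched structure needs special handling, here is a minimal sketch of the conditional log-likelihood that underlies standard conditional logistic regression, for matched sets with exactly one case each. It is not the authors' random forest method or code. The function name `conditional_loglik`, the toy data, and the coefficient values are illustrative assumptions.

```python
import numpy as np

def conditional_loglik(beta, X, y, strata):
    """Conditional log-likelihood for matched sets with exactly one case each.

    Each matched set contributes log( exp(x_case @ beta) / sum_j exp(x_j @ beta) ),
    where j runs over all members (case and controls) of that set.
    """
    eta = X @ beta
    total = 0.0
    for s in np.unique(strata):
        idx = strata == s
        eta_s = eta[idx]
        case = y[idx] == 1
        # Log-sum-exp computed stably by subtracting the set-wise maximum.
        m = eta_s.max()
        total += eta_s[case][0] - (m + np.log(np.exp(eta_s - m).sum()))
    return total

# Toy data (made up): 3 matched sets, each with 1 case and 2 controls.
rng = np.random.default_rng(0)
X = rng.normal(size=(9, 2))
y = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0])
strata = np.repeat([0, 1, 2], 3)
beta = np.array([0.5, -0.3])  # hypothetical coefficients

ll = conditional_loglik(beta, X, y, strata)

# Add a covariate that is constant within each matched set (like a matching
# variable). It cancels out of every set's contribution.
offset = np.array([2.0, -1.0, 0.7])[strata]
X_aug = np.column_stack([X, offset])
beta_aug = np.append(beta, 1.0)
ll_shifted = conditional_loglik(beta_aug, X_aug, y, strata)

print(np.isclose(ll, ll_shifted))
```

Because anything constant within a matched set drops out, the likelihood depends only on within-set contrasts. An unconditional learner that treats all observations as independent does not respect this structure.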

Cite this
GOST
Schauberger G., Klug S. J., Berger M. Random forests for the analysis of matched case–control studies // BMC Bioinformatics. 2024. Vol. 25. No. 1. 253
RIS
TY - JOUR
DO - 10.1186/s12859-024-05877-5
UR - https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-024-05877-5
TI - Random forests for the analysis of matched case–control studies
T2 - BMC Bioinformatics
AU - Schauberger, Gunther
AU - Klug, Stefanie J.
AU - Berger, Moritz
PY - 2024
DA - 2024/08/01
PB - Springer Nature
IS - 1
VL - 25
PMID - 39090608
SN - 1471-2105
ER -
BibTeX
@article{2024_Schauberger,
author = {Gunther Schauberger and Stefanie J. Klug and Moritz Berger},
title = {Random forests for the analysis of matched case–control studies},
journal = {BMC Bioinformatics},
year = {2024},
volume = {25},
publisher = {Springer Nature},
month = {aug},
url = {https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-024-05877-5},
number = {1},
pages = {253},
doi = {10.1186/s12859-024-05877-5}
}