Volume 9, Issue 4, Pages 602-614

Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide

Iain Marshall 1
A. H. Noel-Storr 2
Joël Kuiper 3
James Thomas 4
Byron Wallace 5
Publication type: Journal Article
Publication date: 2018-02-07
SCImago: Q1
WoS: Q1
SJR: 4.020
CiteScore: 17.5
Impact factor: 6.1
ISSN: 1759-2879, 1759-2887
PubMed ID: 29314757
Education
Abstract
Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical Hedges), allowing direct comparison with traditional database search filters. We estimated area under receiver operating characteristics (AUROC) using the Clinical Hedges dataset. We demonstrate that ML approaches better discriminate between RCTs and non-RCTs than widely used traditional database search filters at all sensitivity levels; our best-performing model also achieved the best results to date for ML in this task (AUROC 0.987, 95% CI, 0.984-0.989). We provide practical guidance on the role of ML in (1) systematic reviews (high-sensitivity strategies) and (2) rapid reviews and clinical question answering (high-precision strategies) together with recommended probability cutoffs for each use case. Finally, we provide open-source software to enable these approaches to be used in practice.
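The trade-off between high-sensitivity and high-precision cutoffs described in the abstract can be illustrated with a minimal sketch. It is not the authors' software and does not reproduce their recommended cutoffs: the labels and scores below are hypothetical toy data, and the function names (`auroc`, `sensitivity_precision`, `cutoff_for_sensitivity`) are invented for illustration. It assumes a classifier that outputs a probability-like score per abstract, and shows how AUROC and a sensitivity-targeted cutoff might be computed from such scores.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random RCT outscores a random non-RCT (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_precision(labels, scores, cutoff):
    """Sensitivity and precision when every score >= cutoff is classified as an RCT."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return sensitivity, precision


def cutoff_for_sensitivity(labels, scores, target):
    """Highest observed score that, used as a cutoff, reaches at least the target sensitivity."""
    for c in sorted(set(scores), reverse=True):
        if sensitivity_precision(labels, scores, c)[0] >= target:
            return c
    return min(scores)


# Hypothetical toy data: 1 = RCT, 0 = non-RCT, with made-up classifier scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10, 0.05, 0.02]

print("AUROC:", auroc(labels, scores))

# Systematic-review style: keep every RCT, accept more false positives.
sensitive_cutoff = cutoff_for_sensitivity(labels, scores, target=1.0)
print("high-sensitivity cutoff:", sensitive_cutoff,
      sensitivity_precision(labels, scores, sensitive_cutoff))

# Rapid-review style: a stricter cutoff trades missed RCTs for fewer false positives.
print("high-precision cutoff: 0.80",
      sensitivity_precision(labels, scores, 0.80))
```

On this toy data the lower cutoff retrieves every RCT at the cost of two false positives, while the stricter cutoff returns only RCTs but misses half of them. In practice a cutoff would be chosen on a held-out validation set rather than on the data used for evaluation.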

Top-30

Journals

Cochrane Database of Systematic Reviews: 183 publications, 53.35%
Journal of Clinical Epidemiology: 11 publications, 3.21%
Research Synthesis Methods: 10 publications, 2.92%
Systematic Reviews: 7 publications, 2.04%
BMJ: 6 publications, 1.75%
BMJ Open: 5 publications, 1.46%
Journal of Biomedical Informatics: 4 publications, 1.17%
JMIR Medical Informatics: 3 publications, 0.87%
BMC Medical Research Methodology: 3 publications, 0.87%
Campbell Systematic Reviews: 3 publications, 0.87%
BMJ Medicine: 3 publications, 0.87%
Journal of Medical Internet Research: 2 publications, 0.58%
Clinical Science: 2 publications, 0.58%
PLoS ONE: 2 publications, 0.58%
Artificial Intelligence Review: 2 publications, 0.58%
Journal of the Society for Social Work and Research: 1 publication, 0.29%
HRB Open Research: 1 publication, 0.29%
PharmacoEconomics: 1 publication, 0.29%
Journal of Clinical Nursing: 1 publication, 0.29%
International Journal of Stroke: 1 publication, 0.29%
Journal of Advanced Oral Research: 1 publication, 0.29%
Children: 1 publication, 0.29%
Frontiers in Medicine: 1 publication, 0.29%
Infection: 1 publication, 0.29%
BMC Medical Informatics and Decision Making: 1 publication, 0.29%
International Journal of Behavioral Nutrition and Physical Activity: 1 publication, 0.29%
Mayo Clinic Proceedings: 1 publication, 0.29%
Semergen: 1 publication, 0.29%
Decision Analytics Journal: 1 publication, 0.29%

Publishers

Wiley: 205 publications, 59.77%
Elsevier: 32 publications, 9.33%
Springer Nature: 25 publications, 7.29%
BMJ: 17 publications, 4.96%
Cold Spring Harbor Laboratory: 15 publications, 4.37%
JMIR Publications: 8 publications, 2.33%
Taylor & Francis: 5 publications, 1.46%
MDPI: 4 publications, 1.17%
SAGE: 3 publications, 0.87%
Public Library of Science (PLoS): 3 publications, 0.87%
Oxford University Press: 3 publications, 0.87%
Cambridge University Press: 3 publications, 0.87%
Portland Press: 2 publications, 0.58%
PeerJ: 2 publications, 0.58%
University of Chicago Press: 1 publication, 0.29%
F1000 Research: 1 publication, 0.29%
Frontiers Media S.A.: 1 publication, 0.29%
IOP Publishing: 1 publication, 0.29%
Johann Ambrosius Barth: 1 publication, 0.29%
Royal College of General Practitioners: 1 publication, 0.29%
Pan American Health Organization: 1 publication, 0.29%
Emerald: 1 publication, 0.29%
American College of Physicians: 1 publication, 0.29%
American Educational Research Association: 1 publication, 0.29%
American Society of Clinical Oncology (ASCO): 1 publication, 0.29%
OOO Zhurnal "Mendeleevskie Soobshcheniya": 1 publication, 0.29%
Ovid Technologies (Wolters Kluwer Health): 1 publication, 0.29%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
343
GOST
Marshall I. et al. Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide // Research Synthesis Methods. 2018. Vol. 9. No. 4. pp. 602-614.
GOST (all authors, up to 50)
Marshall I., Noel-Storr A. H., Kuiper J., Thomas J., Wallace B. Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide // Research Synthesis Methods. 2018. Vol. 9. No. 4. pp. 602-614.
RIS
TY - JOUR
DO - 10.1002/jrsm.1287
UR - https://doi.org/10.1002/jrsm.1287
TI - Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide
T2 - Research Synthesis Methods
AU - Marshall, Iain
AU - Noel-Storr, A. H.
AU - Kuiper, Joël
AU - Thomas, James
AU - Wallace, Byron
PY - 2018
DA - 2018/02/07
PB - Wiley
SP - 602
EP - 614
IS - 4
VL - 9
PMID - 29314757
SN - 1759-2879
SN - 1759-2887
ER -
BibTeX (up to 50 authors)
@article{2018_Marshall,
author = {Iain Marshall and A. H. Noel-Storr and Joël Kuiper and James Thomas and Byron Wallace},
title = {Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide},
journal = {Research Synthesis Methods},
year = {2018},
volume = {9},
publisher = {Wiley},
month = {feb},
url = {https://doi.org/10.1002/jrsm.1287},
number = {4},
pages = {602--614},
doi = {10.1002/jrsm.1287}
}
MLA
Marshall, Iain, et al. “Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide.” Research Synthesis Methods, vol. 9, no. 4, Feb. 2018, pp. 602-614. https://doi.org/10.1002/jrsm.1287.