Chemical Engineering Journal, volume 405, pages 126627
Shedding light on “Black Box” machine learning models for predicting the reactivity of HO radicals toward organic compounds
Publication type: Journal Article
Publication date: 2021-02-01
Journal:
Chemical Engineering Journal
scimago Q1
wos Q1
SJR: 2.852
CiteScore: 21.7
Impact factor: 13.3
ISSN: 13858947, 03009467
General Chemistry
General Chemical Engineering
Industrial and Manufacturing Engineering
Environmental Chemistry
Abstract
• An MF-ML-assisted QSAR model was developed for the reactivity of 1089 compounds toward HO radicals.
• An ensemble model that combined XGBoost and DNN was developed.
• The SHAP method was used to interpret all the obtained models.
• The model made predictions based on correctly "learned" chemical knowledge.
Developing quantitative structure-activity relationships (QSARs) is an important approach to predicting the reactivity of HO radicals toward newly emerged organic compounds. Compared with QSARs based on molecular descriptors or on the group contribution method, a combined molecular fingerprint-machine learning (ML) method can develop such models more quickly and accurately for a growing number of contaminants. However, it is not yet known whether this method makes predictions by choosing meaningful structural features rather than spurious ones, which is vital for trusting the models. In this study, we developed QSAR models for the log k_HO values of 1089 organic compounds in the aqueous phase using two ML algorithms, deep neural networks (DNN) and eXtreme Gradient Boosting (XGBoost), and interpreted the resulting models with the SHapley Additive exPlanations (SHAP) method. The results showed that DNN treated the contribution of a given structural feature to log k_HO as a fixed value across different compounds, whereas XGBoost treated it as a variable value. We then developed an ensemble model combining the DNN with XGBoost, which achieved satisfactory predictive performance on all three datasets:
Training dataset: R-squared (R²) 0.89–0.91, root-mean-squared error (RMSE) 0.21–0.23, and mean absolute error (MAE) 0.15–0.17.
Validation dataset: R² 0.63–0.78, RMSE 0.29–0.32, and MAE 0.21–0.25.
Test dataset: R² 0.60–0.71, RMSE 0.30–0.35, and MAE 0.23–0.25.
The SHAP method was further used to show that this ensemble model predicted log k_HO based on a correct "understanding" of the impact of electron-withdrawing and electron-donating groups and of the reactive sites in the compounds that can be attacked by HO. This study offers much-needed mechanistic insight into an ML-assisted environmental task, which is important for evaluating the trustworthiness of ML-based models, further improving the models for specific applications, and leveraging the implicit knowledge the models carry.
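The three metrics quoted in the abstract can be illustrated with a small sketch. It is not the authors' code: the abstract does not say how the DNN and XGBoost predictions are combined, so the simple average in `average_ensemble` is an assumption, and the function names and the log k_HO values below are made up for illustration only.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ensemble(pred_a, pred_b):
    """Assumed combination rule: unweighted mean of two models' predictions."""
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


# Hypothetical log k_HO values and model outputs (not data from the paper).
y_true = [9.0, 9.5, 10.0, 8.5]
pred_dnn = [9.2, 9.4, 9.8, 8.8]
pred_xgb = [8.8, 9.8, 10.0, 8.4]

pred_ens = average_ensemble(pred_dnn, pred_xgb)
print("R2  :", round(r2_score(y_true, pred_ens), 3))
print("RMSE:", round(rmse(y_true, pred_ens), 3))
print("MAE :", round(mae(y_true, pred_ens), 3))
```

Reporting all three together, as the abstract does, is useful because R² depends on the spread of the target values in each dataset, while RMSE and MAE are expressed directly in log k_HO units.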
Top-30
Journals
Environmental Science & Technology: 17 publications, 18.09%
Journal of Hazardous Materials: 7 publications, 7.45%
Chemosphere: 5 publications, 5.32%
Chemical Engineering Journal: 4 publications, 4.26%
ACS ES&T Engineering: 4 publications, 4.26%
Journal of Hydrology: 3 publications, 3.19%
Water Research: 3 publications, 3.19%
Environmental Science and Technology Letters: 3 publications, 3.19%
Separation and Purification Technology: 3 publications, 3.19%
Atmospheric Chemistry and Physics: 2 publications, 2.13%
ACS ES&T Water: 2 publications, 2.13%
Journal of Cleaner Production: 2 publications, 2.13%
Chemical Engineering Science: 2 publications, 2.13%
Atmospheric Environment: 2 publications, 2.13%
Environmental Research: 2 publications, 2.13%
Combustion and Flame: 2 publications, 2.13%
Science of the Total Environment: 2 publications, 2.13%
Process Safety and Environmental Protection: 2 publications, 2.13%
Water Resources Management: 1 publication, 1.06%
Chemical Engineering Journal Advances: 1 publication, 1.06%
Fuel: 1 publication, 1.06%
Materials Today Communications: 1 publication, 1.06%
Journal of Environmental Management: 1 publication, 1.06%
Analytica Chimica Acta: 1 publication, 1.06%
Journal of Materials Chemistry A: 1 publication, 1.06%
Reaction Chemistry and Engineering: 1 publication, 1.06%
IEEE Sensors Journal: 1 publication, 1.06%
Chemical Reviews: 1 publication, 1.06%
Journal of Chemical Information and Modeling: 1 publication, 1.06%
Publishers
Elsevier: 53 publications, 56.38%
American Chemical Society (ACS): 29 publications, 30.85%
Royal Society of Chemistry (RSC): 3 publications, 3.19%
Copernicus: 2 publications, 2.13%
Springer Nature: 2 publications, 2.13%
Institute of Electrical and Electronics Engineers (IEEE): 2 publications, 2.13%
MDPI: 1 publication, 1.06%
AIP Publishing: 1 publication, 1.06%
- We do not take into account publications without a DOI.
- Statistics recalculated only for publications connected to researchers, organizations and labs registered on the platform.
- Statistics recalculated weekly.
Cite this
GOST
Zhong S. et al. Shedding light on “Black Box” machine learning models for predicting the reactivity of HO radicals toward organic compounds // Chemical Engineering Journal. 2021. Vol. 405. p. 126627.
GOST all authors (up to 50)
Zhong S., Zhang K., Wang D., Zhang H. Shedding light on “Black Box” machine learning models for predicting the reactivity of HO radicals toward organic compounds // Chemical Engineering Journal. 2021. Vol. 405. p. 126627.
RIS
TY - JOUR
DO - 10.1016/j.cej.2020.126627
UR - https://doi.org/10.1016/j.cej.2020.126627
TI - Shedding light on “Black Box” machine learning models for predicting the reactivity of HO radicals toward organic compounds
T2 - Chemical Engineering Journal
AU - Zhong, Shifa
AU - Zhang, Kai
AU - Wang, Dong
AU - Zhang, Huichun
PY - 2021
DA - 2021/02/01
PB - Elsevier
SP - 126627
VL - 405
SN - 1385-8947
SN - 0300-9467
ER -
BibTex (up to 50 authors)
@article{2021_Zhong,
author = {Shifa Zhong and Kai Zhang and Dong Wang and Huichun Zhang},
title = {Shedding light on “Black Box” machine learning models for predicting the reactivity of HO radicals toward organic compounds},
journal = {Chemical Engineering Journal},
year = {2021},
volume = {405},
publisher = {Elsevier},
month = {feb},
url = {https://doi.org/10.1016/j.cej.2020.126627},
pages = {126627},
doi = {10.1016/j.cej.2020.126627}
}