Analytical Biochemistry, volume 695, pages 115654
The development of machine learning approaches in two-dimensional NMR data interpretation for metabolomics applications
Julie Pollak¹, Moses Mayonu¹, Lin Jiang², Bo Wang³
Publication type: Journal Article
Publication date: 2024-12-01
Journal: Analytical Biochemistry
Scimago quartile: Q3
WoS quartile: Q2
SJR: 0.493
CiteScore: 5.7
Impact factor: 2.6
ISSN: 0003-2697, 1096-0309
Abstract
Metabolomics has been widely applied in human disease research and environmental science to study systematic changes in metabolites in response to diverse stimuli. NMR-based metabolomics is widely used, but peak overlap in one-dimensional (1D) NMR spectra can limit the accuracy of quantitative analysis. Two-dimensional (2D) NMR has been applied to resolve the 1D overlap problem, but processing the data remains challenging. In this study, we built an automatic approach for processing 2D NMR data for quantitative applications using machine learning. Partial least squares discriminant analysis (PLS-DA), artificial neural network classification (ANN-DA), gradient boosted trees classification (XGBoost-DA), and artificial deep learning neural network classification (ANNDL-DA) were applied in combination with an automatic peak selection approach. Standard mixtures, sea anemone extracts, and mouse fecal samples were tested to demonstrate the approach. Our results showed that ANN-DA and ANNDL-DA select 2D NMR peaks with high accuracy (around 90%), indicating strong potential for quantitative 2D NMR-based metabolomics studies, while PLS-DA and XGBoost-DA showed limitations related to either data variation or overfitting. Our study provides an automatic approach for applying 2D NMR data to routine quantitative analysis in metabolomics.
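The abstract does not describe how the automatic peak selection step works, so the following is only an illustrative sketch of one common way to generate candidate peaks from a 2D intensity grid: keep every point that exceeds a noise threshold and is a strict local maximum among its neighbours. The function name `pick_peaks_2d`, the `threshold` parameter, and the toy grid are assumptions introduced here, not the authors' method; in a pipeline like the one described, such candidates would then be passed to a classifier (for example ANN-DA) that accepts or rejects each one.

```python
from typing import List, Tuple


def pick_peaks_2d(
    spectrum: List[List[float]], threshold: float
) -> List[Tuple[int, int, float]]:
    """Return (row, col, intensity) for each point above `threshold` that is
    strictly greater than all of its existing 8-connected neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    peaks = []
    n_rows = len(spectrum)
    n_cols = len(spectrum[0]) if n_rows else 0
    for i in range(n_rows):
        for j in range(n_cols):
            value = spectrum[i][j]
            if value <= threshold:
                continue  # treat as noise
            # Collect neighbours, clipping the 3x3 window at the grid edges.
            neighbours = [
                spectrum[a][b]
                for a in range(max(0, i - 1), min(n_rows, i + 2))
                for b in range(max(0, j - 1), min(n_cols, j + 2))
                if (a, b) != (i, j)
            ]
            if all(value > n for n in neighbours):
                peaks.append((i, j, value))
    return peaks


# Synthetic 5x5 grid with two well-separated maxima (not real NMR data).
toy = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 5.0, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.0],
    [0.0, 0.0, 0.3, 3.0, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.1],
]
candidates = pick_peaks_2d(toy, threshold=1.0)
print(candidates)
```

On real spectra, overlapping or noisy signals produce many spurious local maxima, which is the kind of ambiguity a downstream classifier would be expected to resolve.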