Kansenshogaku zasshi

The Japanese Association for Infectious Diseases
ISSN: 03875911, 1884569X

Years of issue
2024-2025
journal names
Kansenshogaku zasshi
Publications
5 297
Citations
7 520
h-index
19
Top-3 citing journals
Top-3 organizations
Kawasaki Medical School (101 publications)
Nagasaki University (69 publications)
Top-3 countries
Japan (671 publications)
Italy (56 publications)
USA (15 publications)

Most cited in 5 years

Publications found: 2482
Principal Components Analysis: Row Scaling and Compositional Data
Brereton R.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0
Detection of Lead Chrome Green in Tea Based on Near‐Infrared Reflectance Spectroscopy
Jiang X., Cheng P., Ge K., Lv S., Liu Y.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Tea color is a part of tea quality, and the illegal addition of lead chrome green (LCG) to improve tea quality cannot be detected by the human eye. This paper uses near‐infrared (NIR) reflectance spectroscopy to detect LCG‐stained tea and investigates the feasibility of qualitative and quantitative methods. First, the LCG in tea was qualitatively analyzed with partial least squares discriminant analysis (PLS‐DA), random forest (RF), and least squares support vector machine (LSSVM) classification models; the classification accuracy of LSSVM reached 100%. For quantitative analysis, Savitzky–Golay convolutional smoothing (S‐G) preprocessing was combined with three feature extraction algorithms, namely competitive adaptive reweighted sampling (CARS), uninformative variable elimination (UVE), and the successive projections algorithm (SPA), and partial least squares (PLS), RF, and LSSVM regression models were then built on the preprocessed data. The S‐G‐UVE‐LSSVM model showed the best regression performance in detecting LCG in tea, with a test R² of 0.96. These results show the feasibility of NIR spectroscopy for the detection of added LCG in tea.
Determination of Halitosis by Exhaled Breath Analysis Using Semiconductor Metal Oxide Sensors and Chemometric Methods
Saveliev M., Volchek A., Lavrenova G., Malay O., Grevtsev M., Jahatspanian I.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Halitosis is a condition associated with bad breath. Although halitosis is a disease in its own right, it is often a symptom of more serious diseases (diabetes mellitus, renal failure, azotemia, etc.). The currently used method for diagnosing halitosis is the organoleptic method, which relies on a trained specialist evaluating the patient's breath odor. This approach is subjective, uncomfortable for both patient and doctor, and requires a specially trained professional. As an alternative, instrumental diagnostics employing metal oxide semiconductor (MOS) sensor arrays offer a promising avenue by enabling patient classification through predeveloped models. This paper considers the application of seven MOS sensors of different compositions at three different temperatures. Different methods of chemometric data analysis were applied: k‐nearest neighbors (kNN), decision trees (DT), support vector machine (SVM), logistic regression (LR), and projection to latent structures discriminant analysis (PLS‐DA). All applied methods proved effective, achieving selectivity, sensitivity, and accuracy values exceeding 85%. Additionally, a combined classifier leveraging responses from all previously studied classifiers was explored, achieving near‐perfect classification accuracy.
A Multiple Linear Regression–Based Algorithm to Correct for Cosmic Rays in Raman Images
Mitsutake H., de Paula E., Bordallo H., Rutledge D.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Raman imaging is a powerful technique for simultaneously obtaining chemical and spatial information on diverse materials. One of the most common detectors used in Raman equipment is the charge‐coupled device (CCD), due to its high sensitivity. However, CCDs are also sensitive to cosmic rays, which generate very narrow and intense signals: cosmic ray spikes. Since these peaks can be very intense and numerous, it is important to eliminate them before any data analysis. Some methods compare neighboring pixels to identify spikes, but in the line‐scanning acquisition mode these spikes commonly appear in two or more pixels close together. Thus, in this work, a new algorithm has been developed to correct for cosmic ray spikes in Raman images, based on multiple linear regression (MLR). This algorithm takes less than 1 min on images with more than 70,000 spectra and removes all spikes, even those at low intensity.
Multimodal Stacked Modeling for Simultaneous Detection of Nutrient Concentrations With Turbidity Correction
Nini M., Nohair M.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: This paper investigates a method for the simultaneous determination of nitrite, nitrate, and COD in water in the presence of turbidity, which acts as a source of noise in spectroscopic data. UV–Vis absorption spectrometry and machine learning are combined in a stacking model that joins several base models (PLS, Lasso, and Ridge regression) with a meta‐regressor (Random Forest regressor); baseline correction and principal component analysis (PCA) are incorporated to mitigate the effects of turbidity on the spectroscopic data. After applying these corrections, the root mean square error (RMSE) and the mean absolute error (MAE) were significantly reduced, and the correlation coefficient (R²) between predicted and actual values of nitrite, nitrate, COD, and turbidity was greater than 0.96 for all compounds in the test data set. This demonstrates the ability of the proposed stacking model to predict nutrient concentrations simultaneously and accurately, even in complex environments, and the model may provide a valuable alternative to wet chemical methods. Due to its high accuracy and fast response, the proposed model can be used as an algorithm for the construction of nutrient sensors. This paper highlights the importance of integrating advanced modeling and data correction techniques to improve the robustness and accuracy of predictive models in environmental chemistry, thus providing valuable information for environmental monitoring and management.
Comparison of Chemometric Explorative Multi‐Omics Data Analysis Methods Applied to a Mechanistic Pan‐Cancer Cell Model
Westerhuis J., Heintz‐Buschart A., Hoefsloot H., van der Kloet F., van der Ploeg G., White F.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: The analysis of single‐cell multi‐omics data is a complex task, and many explorative data analysis methods are being used to draw information from such data. This paper compares several of these methods to visualize the output of a mechanistic model under various simulated conditions. The analysis methods include PCA, PARAFAC, ASCA, MASCARA, COVSCA, P‐ESCA, and PE‐ASCA. These techniques, applied to high‐dimensional data such as gene expression and protein levels, assess correlations across time series and experimental conditions. The study uses a complex mechanistic model of MCF10A cancer cells, simulating interactions between signaling pathways related to cell growth and division. Results show that while methods like PCA, PARAFAC, and ASCA reveal time‐dependent variations in protein data, mRNA data exhibit minimal systematic variation. MASCARA offers unique insights by identifying genes linked to specific pathways. This work highlights the potential and limitations of various data analysis methods in understanding multi‐omics data, particularly in single‐cell contexts where experimental variation and stochastic processes complicate interpretation.
Improving Vapor Pressure Prediction Through Integration of Multiple Molecular Representations: A Super Learner Approach
Hyun Nam J., Lee S., Jo S., Kim J., Lee J., Koo J., Lee B., Jeong K., Yu D.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Accurate prediction of vapor pressure is essential in chemical engineering, environmental science, and pharmaceutical development, impacting the volatility and stability of compounds. Traditional methods often fall short for complex and new molecular structures. This study introduces an advanced machine learning approach that integrates graph neural networks (GNNs) and Chem‐BERT models to improve prediction accuracy. Utilizing the largest dataset to date, we derived comprehensive chemical descriptors and fingerprints. We evaluated 19 predictive models, including ridge regression, random forest, support vector regression, and feed‐forward neural networks, trained on diverse features like PaDEL and Morgan fingerprints, chemical descriptors, and Chem‐BERT embeddings. Central to our methodology is the super learner architecture, which combines the 19 models to enhance accuracy. The super learner achieved a root mean squared error (RMSE) of 0.8200, outperforming individual models and previous reports. These results highlight the effectiveness of integrating GNNs and Chem‐BERT for capturing detailed molecular information, setting a new benchmark for vapor pressure prediction. This study underscores the value of advanced machine learning techniques and comprehensive datasets, offering a robust tool for researchers and paving the way for future advancements in chemical property prediction.
Progress of Complex System Process Analysis Based on Modern Spectroscopy Combined With Chemometrics
Li M., Cai Q., Zhang T., Tang H., Li H.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: In recent years, the role of analytical chemistry has undergone a gradual transformation, evolving from a mere participant to a pivotal decision‐maker in process optimisation. This shift can be attributed to the advent of sophisticated analytical instrumentation, which has ushered in a new era of analytical capabilities. This article presents a review of the developments in the application of intelligent analysis techniques, including infrared (IR) spectroscopy, Raman spectroscopy, and laser‐induced breakdown spectroscopy (LIBS), in the processing of complex systems over the past decade. The review provides an introduction to the fundamental principles of these analytical techniques and examines the evolution of their instrumentation to accommodate online process monitoring. The analysis of spectral data in complex system processes represents a fundamental aspect of the attainment of on‐site quality monitoring, process optimisation and control. Accordingly, the review provides a comprehensive overview of the methodologies employed in process chemometrics, encompassing spectral preprocessing, feature selection, modelling techniques, and optimisation strategies for model performance. Furthermore, this article presents a summary of three intelligent spectral analysis tools, namely infrared spectroscopy, Raman spectroscopy, and LIBS, which are widely employed in process simulation, monitoring, optimisation, and control across multiple disciplines, including the environment, energy, biology, and food. The objective of this review is to provide a valuable reference point and guidance for the further promotion and utilisation of spectral intelligent analysis instruments, with the aim of promoting their in‐depth application and development in a greater number of fields.
Cell Culture Media and Raman Spectra Preprocessing Procedures Impact Glucose Chemometrics
Pavurala N., Madhavarao C., Lee J., Das J., Ashraf M., O'Connor T.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Deployment of process analytical technology tools such as Raman or IR spectroscopy and associated multivariate calibration models for process monitoring and control plays an important role in process automation and advanced manufacturing of pharmaceuticals. Preprocessing or preparation of the spectroscopic data is an important step in developing a multivariate calibration model. There are several ways available to preprocess the data and each may influence the calibration model performance differently. Here we investigated the influence of preprocessing procedures on the development and performance of the chemometric models to predict the glucose concentration in a bioreactor. A Box–Behnken design of experiment (DOE) was used to generate the Raman spectroscopy data. Four factors were considered critical in the DOE: glucose, glutamine, glutamic acid, and antifoam concentration. Raman spectroscopy data were collected both with and without aeration, independently from three cell culture media. For each medium, the data consisted of a calibration set (27 conditions) and a separate model validation set (9 conditions). Additionally, Raman data were collected for certain DOE runs with increasing cell densities ranging from 0.5 × 10⁶/mL to 30 × 10⁶/mL under aerating conditions. Data from the three cell culture media were used separately to develop calibration models that used four different preprocessing procedures, namely, baseline correction (BLC), Savitzky–Golay smoothing (SGS), Savitzky–Golay derivative (SGD) and orthogonal signal correction (OSC). The preprocessing procedures were applied individually and in combinations to evaluate the calibration model parameters and the performance metrics. We further developed glucose calibration models based on partial least squares (PLS) regression with 1–3 principal components. The models developed with the OSC procedure gave superior performance metrics with just one principal component across all three media. Models developed with other preprocessing procedures required two or more principal components to give comparable performance. Overall, the choice of preprocessing procedures affected the model performance.
Nonparametric Threshold Estimation of Autocorrelated Statistics in Multivariate Statistical Process Monitoring
Grimm T., Newhart K., Hering A.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Multivariate statistical process monitoring is commonly used to detect abnormal process behavior in real time. Multiple process variables are monitored simultaneously, and alarms are issued when monitoring statistics exceed a predetermined threshold. Traditional approaches use a parametric threshold based on the assumptions of independence and multivariate normality of the process data, which are often violated in complex processes with high sampling frequencies, leading to excessive false alarms. Some approaches for improved threshold selection have been proposed, but they assume independence of the monitoring statistics, which are often autocorrelated. In this paper, we compare the performance of nonparametric estimators for computing thresholds from autocorrelated monitoring statistics through simulation. The false alarm rate and in‐control average run length of each estimator under different distributions, sample sizes, and autocorrelation levels and types are found. Estimator performance is found to depend on sample size and the strength of autocorrelation. The class of kernel density estimation (KDE) methods tends to perform better than estimators that use bootstrapping, and the proposed adjusted KDE methods that account for autocorrelation are recommended for general use. A case study to monitor a wastewater treatment facility further illustrates the performance of nonparametric and parametric thresholds when applied to real‐world systems.
An Alignment‐Agnostic Methodology for the Analysis of Designed Separations Data
Sorochan Armstrong M., Camacho J.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Chemical separations data are typically analyzed in the time domain using methods that integrate the discrete elution bands. Integrating the same chemical components across several samples must account for retention time drift over the course of an entire experiment as the physical characteristics of the separation are altered through several cycles of use. Failure to consistently integrate the components within a matrix of samples and variables creates artifacts that have a profound effect on the analysis and interpretation of the data. This work presents an alternative where the raw separations data are analyzed in the frequency domain to account for the offset of the chromatographic peaks as a matrix of complex Fourier coefficients. We present a generalization of the factorization, permutation testing, and visualization steps in ANOVA‐simultaneous component analysis (ASCA) to handle complex matrices and use this method to analyze a synthetic dataset with known significant factors and compare the interpretation of a real dataset via its peak table and frequency domain representations.
A Greener, Safer, and More Understandable AI for Natural Science and Technology
Martens H.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: More rational, open‐minded use of quantitative Big Data in Science and Technology is required for better real‐world problem solving as well as for the stabilization of shared belief structures in society. Modern instrumentation gives informative but overwhelming data streams. A thermal video camera with suitable spatiotemporal subspace modeling allows us to detect surface temperature changes of, for example, engines, that can reveal something going on inside. An RGB video camera responds to both motions and color changes in nature, often with spatiotemporal change patterns that we can discover and describe mathematically, validate statistically, interpret graphically, and then use for sensible things. A hyperspectral Vis./NIR satellite camera with hundreds of wavelengths reveals changes in clouds and at each earth location, again and again. Today we know how to decode such overwhelming streams of high‐dimensional data into physical and chemical causalities by minimalistic hybrid multivariate subspace models. We thereby combine prior knowledge with the ability to discover new, reliable variation patterns. Minimalistic subspace models handle such data. These “open‐ended” multivariate linear hybrid models are computationally fast, statistically safe, and graphically understandable. The minimalistic subspace models are therefore suitable for both data modeling (based on multivariate measurements) and metamodeling (based on input–output simulation results for nonlinear mechanistic models' behavioral repertoire). That makes it easier to combine high‐dimensional streams of real‐world measurements and complicated, slow mechanistic models. Implemented as minimalistic foundation models with hierarchies of extended subspace models, this can form a basis for faster discovery and problem solving in Natural Science & Technology.
Evaluation of Annual Storage on Online Released Compounds in Artemisia argyi Smoke With GC‐MS‐Based Untargeted Metabolomics Coupled With Chemometric Method of AntDAS‐GCMS
Zhai M., Liu J., Wang L., Wen Y., Ma H., Liu P., Chai G., Zhang Q., Ma J., Yu Y.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: Artemisia argyi smoke, generated from the combustion of A. argyi, is widely utilized in traditional Chinese medicine for moxibustion and fumigation therapies. The smoke released during the combustion of A. argyi is rich in compounds and can be affected by the storage period. However, the chemical composition of the released smoke is not comprehensively understood, and the effects of annual storage on it remain unclear. Herein, a strategy that integrates gas chromatography–mass spectrometry (GC‐MS) with advanced chemometric software, AntDAS‐GCMS, was developed for comprehensively characterizing tens of compounds in the released smoke of A. argyi and evaluating the quality variation across annual storage periods. Both particle and gas phases of the smoke released during combustion were collected for GC‐MS analysis; the raw data files were then imported into our recently developed data analysis software AntDAS‐GCMS for automatically retrieving underlying components. Components that show significant differences among annual storage periods were screened to provide a total of 471 components. Finally, 61 compounds were identified. Both supervised and unsupervised chemometric methods suggest that the 2‐ and 4‐year storage periods were close to the clinically used sample (3‐year storage), whereas too short a storage period (such as 1 year) or too long a storage period (6 years) differed markedly from the 3‐year storage samples. In conclusion, this strategy provides a novel solution for evaluating smoke samples from traditional Chinese medicine.
Stacking Ensemble Learning Method for Quantitative Analysis of Soluble Solid Content in Apples
Zhang L., Huang Z., Zhang X.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: The soluble solids content (SSC) in apples directly affects their quality. This study aimed to detect SSC nondestructively using hyperspectral technology combined with chemometrics. However, data generation may not follow a specific pattern, and even small perturbations in the data can have a significant impact on the constructed model. To improve the anti‐interference capability of individual models, this study proposed a stacking ensemble learning method that adopted partial least squares (PLS), support vector machine (SVM), extreme gradient boosting (XGBoost), and random forest (RF) as basic‐learners, with RF serving as the meta‐learner. Experimental results showed that the performance of the established model on the test set was as follows: the root mean square error (RMSE) was 0.4325, the mean absolute error (MAE) was 0.3245, the mean absolute percentage error (MAPE) was 0.0271, and the coefficient of determination (R²) was 0.9250. These results indicate that the stacking ensemble learning approach could appropriately fuse the predictive results of each basic‐learner and improve on the prediction accuracy of individual models. To verify the superiority of the proposed stacking ensemble learning method, the selection of its basic‐learners, meta‐learner, and combination strategy was compared and analyzed. This study not only provides a theoretical reference for the further development of related nondestructive detection equipment but also offers guidance for fusion algorithms.
Robust Multivariate Dispersion Charts for Quality Control: Application to Sulfur Dioxide Monitoring
Ajadi J., Abbas N., Riaz M., Ajadi N., Salami T., Adegoke N.
Q1
Wiley
Journal of Chemometrics 2025 citations by CoLab: 0  |  Abstract
Abstract: This study introduces two robust multivariate Shewhart‐type control charts based on grouped observations to detect changes in the covariance matrix, with a focus on monitoring sulfur dioxide levels during quality control processes. We compute the covariance matrix of observations and apply the least absolute shrinkage and selection operator to penalize it in the in‐control process. Logarithms are then applied to eigenvalues derived through singular value decomposition (SVD) of the shrunken covariance matrix, ensuring robustness to non‐normality in the multivariate data. The proposed methods offer significant advantages, particularly in their ability to maintain robustness to non‐normality without relying on strict distributional assumptions. Performance comparisons using the average run length demonstrate that the proposed charts exhibit superior robustness to normality assumptions compared with existing methods. However, potential limitations include the computational complexity of the shrinkage and SVD processes, which may affect scalability to large datasets. An application to the white wine production process illustrates the effectiveness of the proposed methods for analyzing complex multivariate chemical data. These findings indicate that the introduced charts enhance the detection of shifts in the covariance matrix of physicochemical properties, thereby improving the reliability of quality control processes in non‐normal environments. This study provides valuable tools for quality engineers and practitioners in industries dealing with multivariate analytical data, contributing to improved process monitoring and control, higher quality standards, and consistent product outcomes in fields such as food science and industrial chemistry.

Top-100

Citing journals


Citing publishers


Publishing organizations


Publishing organizations in 5 years


Publishing countries

Japan, 671, 12.67%
Italy, 56, 1.06%
USA, 15, 0.28%
India, 3, 0.06%
Algeria, 2, 0.04%
Sweden, 2, 0.04%
United Kingdom, 1, 0.02%
Thailand, 1, 0.02%
Turkey, 1, 0.02%
Switzerland, 1, 0.02%

Publishing countries in 5 years

Japan, 106, 54.36%
Italy, 5, 2.56%
USA, 3, 1.54%
Algeria, 1, 0.51%
Switzerland, 1, 0.51%