MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech
2 Université Paris-Est, LISSI, UPEC, 94400 Vitry sur Seine, France
3 University of Sciences and Arts in Lebanon, Ghobeiry, Lebanon
5 Yobitrust, Technopark El Gazala B11 Route de Raoued Km 3.5, 2088 Ariana, Tunisia
Publication type: Journal Article
Publication date: 2022-01-01
Scimago: Q1
WoS: Q2
SJR: 1.229
CiteScore: 11.5
Impact factor: 4.9
ISSN: 1746-8094, 1746-8108
Signal Processing
Health Informatics
Abstract
• A deep Recurrent Neural Network-based framework for depression recognition from speech.
• A robust approach that outperforms state-of-the-art approaches on the DAIC-WOZ dataset.
• A fast, non-invasive, and non-intrusive approach, convenient for real-world applications.
• Expanded training labels and transferred features to overcome data scarcity.
• Evaluation of the proposed approach in multi-modal and multi-feature experiments.

Clinical depression, or Major Depressive Disorder (MDD), is a common and serious medical illness. In this paper, a deep Recurrent Neural Network-based framework is presented to detect depression and to predict its severity level from speech. Low-level and high-level audio features are extracted from audio recordings to predict the 24 scores of the Patient Health Questionnaire and the binary class of depression diagnosis. To overcome the small size of Speech Depression Recognition (SDR) datasets, expanded training labels and transferred features are considered. The proposed approach outperforms state-of-the-art approaches on the DAIC-WOZ database, with an overall accuracy of 76.27% and a root mean square error of 0.4 in assessing depression, and a root mean square error of 0.168 in predicting depression severity levels. The proposed framework is fast, non-invasive, and non-intrusive, which makes it convenient for real-time applications. Its performance is evaluated in multi-modal and multi-feature experiments. MFCC-based high-level features hold relevant information related to depression. Adding visual action units and other acoustic features further boosts the classification results by 20% and 10%, reaching accuracies of 95.6% and 86%, respectively. The visual-facial modality must be considered carefully because it raises patient privacy concerns, while adding more acoustic features increases the computation time.
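The abstract reports two kinds of figures: an overall accuracy for the binary depression diagnosis and a root mean square error (RMSE) for the predicted scores. As a minimal sketch of how such metrics are commonly computed, the Python below uses only the standard library. The function names and all data values are hypothetical, chosen for illustration; they are not the authors' code or results, and the abstract does not state the scale on which its RMSE values were obtained.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of binary labels that were predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: 1 = depressed, 0 = not depressed.
labels_true = [0, 1, 0, 0, 1, 1, 0, 0]
labels_pred = [0, 1, 0, 1, 1, 0, 0, 0]

# Hypothetical questionnaire scores (made-up integers, not from the paper).
scores_true = [2, 15, 4, 7, 20, 11, 0, 5]
scores_pred = [3, 13, 4, 9, 18, 12, 1, 5]

toy_accuracy = accuracy(labels_true, labels_pred)
toy_rmse = rmse(scores_true, scores_pred)

print(f"accuracy = {toy_accuracy:.2%}")
print(f"RMSE     = {toy_rmse:.3f}")
```

RMSE depends on the scale of the scores, so values computed on raw scores and on normalized scores are not directly comparable.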
Top-30
Journals
- Biomedical Signal Processing and Control: 11 publications, 5.79%
- Journal of Affective Disorders: 6 publications, 3.16%
- Lecture Notes in Computer Science: 6 publications, 3.16%
- IEEE Transactions on Affective Computing: 5 publications, 2.63%
- Multimedia Tools and Applications: 4 publications, 2.11%
- Scientific Reports: 4 publications, 2.11%
- Computers in Biology and Medicine: 3 publications, 1.58%
- Computer Methods and Programs in Biomedicine: 3 publications, 1.58%
- IEEE Journal of Biomedical and Health Informatics: 3 publications, 1.58%
- IEEE Access: 3 publications, 1.58%
- Communications in Computer and Information Science: 3 publications, 1.58%
- Frontiers in Digital Health: 2 publications, 1.05%
- Digital Signal Processing: A Review Journal: 2 publications, 1.05%
- Knowledge-Based Systems: 2 publications, 1.05%
- Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference: 2 publications, 1.05%
- Computer Speech and Language: 2 publications, 1.05%
- Applied Sciences (Switzerland): 2 publications, 1.05%
- Information Fusion: 2 publications, 1.05%
- Lecture Notes in Electrical Engineering: 2 publications, 1.05%
- Heliyon: 2 publications, 1.05%
- Engineering Applications of Artificial Intelligence: 2 publications, 1.05%
- Applied Acoustics: 2 publications, 1.05%
- Speech Communication: 2 publications, 1.05%
- Sensors: 2 publications, 1.05%
- Frontiers in Psychology: 2 publications, 1.05%
- Diagnostics: 1 publication, 0.53%
- Artificial Life and Robotics: 1 publication, 0.53%
- BioMedInformatics: 1 publication, 0.53%
- Wireless Personal Communications: 1 publication, 0.53%
Publishers
- Institute of Electrical and Electronics Engineers (IEEE): 60 publications, 31.58%
- Elsevier: 53 publications, 27.89%
- Springer Nature: 34 publications, 17.89%
- MDPI: 11 publications, 5.79%
- Frontiers Media S.A.: 7 publications, 3.68%
- Association for Computing Machinery (ACM): 5 publications, 2.63%
- JMIR Publications: 4 publications, 2.11%
- Wiley: 4 publications, 2.11%
- SAGE: 1 publication, 0.53%
- Hindawi Limited: 1 publication, 0.53%
- Centre for Evaluation in Education and Science (CEON/CEES): 1 publication, 0.53%
- XMLink: 1 publication, 0.53%
- Taylor & Francis: 1 publication, 0.53%
- IGI Global: 1 publication, 0.53%
- Oxford University Press: 1 publication, 0.53%
- SPIE-Intl Soc Optical Eng: 1 publication, 0.53%
- IMR Press: 1 publication, 0.53%
- FSFEI HE Don State Technical University: 1 publication, 0.53%
- Hogrefe Publishing Group: 1 publication, 0.53%
- We do not take into account publications without a DOI.
- Statistics recalculated weekly.
Metrics
Total citations: 190
Citations from 2024: 121 (63.69%)
Cite this
GOST
Rejaibi E., Komaty A., Meriaudeau F., Agrebi S., Othmani A. MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech // Biomedical Signal Processing and Control. 2022. Vol. 71. p. 103107.
RIS
TY - JOUR
DO - 10.1016/j.bspc.2021.103107
UR - https://doi.org/10.1016/j.bspc.2021.103107
TI - MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech
T2 - Biomedical Signal Processing and Control
AU - Rejaibi, Emna
AU - Komaty, Ali
AU - Meriaudeau, F.
AU - Agrebi, Said
AU - Othmani, Alice
PY - 2022
DA - 2022/01/01
PB - Elsevier
SP - 103107
VL - 71
SN - 1746-8094
SN - 1746-8108
ER -
BibTeX (up to 50 authors)
@article{2022_Rejaibi,
author = {Emna Rejaibi and Ali Komaty and F. Meriaudeau and Said Agrebi and Alice Othmani},
title = {MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech},
journal = {Biomedical Signal Processing and Control},
year = {2022},
volume = {71},
publisher = {Elsevier},
month = {jan},
url = {https://doi.org/10.1016/j.bspc.2021.103107},
pages = {103107},
doi = {10.1016/j.bspc.2021.103107}
}