Volume 71, pages 103107

MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech

Emna Rejaibi 1, 2
Ali Komaty 3
F. Meriaudeau 4
Said Agrebi 5
Alice Othmani 2
2 Université Paris-Est, LISSI, UPEC, 94400 Vitry sur Seine, France
3 University of Sciences and Arts in Lebanon, Ghobeiry, Lebanon
5 Yobitrust, Technopark El Gazala B11 Route de Raoued Km 3.5, 2088 Ariana, Tunisia
Publication type: Journal Article
Publication date: 2022-01-01
Scimago: Q1
WoS: Q2
SJR: 1.229
CiteScore: 11.5
Impact factor: 4.9
ISSN: 1746-8094, 1746-8108
Signal Processing
Health Informatics
Abstract
• A deep Recurrent Neural Network-based framework for depression recognition from speech.
• A robust approach that outperforms state-of-the-art approaches on the DAIC-WOZ dataset.
• A fast, non-invasive, and non-intrusive approach, convenient for real-world applications.
• Expanded training labels and transferred features to overcome data scarcity.
• Evaluation of the proposed approach under multi-modal and multi-feature experiments.

Clinical depression, or Major Depressive Disorder (MDD), is a common and serious medical illness. In this paper, a deep Recurrent Neural Network-based framework is presented to detect depression and to predict its severity level from speech. Low-level and high-level audio features are extracted from audio recordings to predict the 24 scores of the Patient Health Questionnaire and the binary class of depression diagnosis. To overcome the small size of Speech Depression Recognition (SDR) datasets, expanded training labels and transferred features are considered. The proposed approach outperforms state-of-the-art approaches on the DAIC-WOZ database, with an overall accuracy of 76.27% and a root mean square error of 0.4 in assessing depression, while a root mean square error of 0.168 is achieved in predicting the depression severity levels. The proposed framework has several advantages (speed, non-invasiveness, and non-intrusiveness), which make it convenient for real-time applications. The performance of the proposed approach is evaluated under multi-modal and multi-feature experiments. MFCC-based high-level features hold relevant information related to depression. Yet adding visual action units and other acoustic features further boosts the classification results by 20% and 10%, reaching accuracies of 95.6% and 86%, respectively. The visual-facial modality needs to be considered carefully because it raises patient privacy concerns, while adding more acoustic features increases the computation time.
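The abstract reports two kinds of metrics: a classification accuracy for the binary diagnosis and a root mean square error (RMSE) for the severity prediction. The following is a minimal sketch of how such metrics can be computed, not the authors' evaluation code. The helper names (`rmse`, `accuracy`, `normalize_phq`), the sample values, and the choice to divide questionnaire scores by a maximum of 24 before computing the RMSE are all assumptions made for illustration; the record above does not state the exact protocol.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(labels_true, labels_pred):
    """Fraction of predicted labels that match the reference labels."""
    if len(labels_true) != len(labels_pred) or not labels_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


def normalize_phq(scores, max_score=24):
    """Scale questionnaire scores to [0, 1]; max_score=24 is an assumption."""
    return [s / max_score for s in scores]


# Made-up values for illustration only; they are not results from the paper.
phq_true = [2, 10, 15, 7]
phq_pred = [4, 9, 12, 7]
labels_true = [0, 1, 1, 0]
labels_pred = [0, 1, 0, 0]

severity_rmse = rmse(normalize_phq(phq_true), normalize_phq(phq_pred))
binary_accuracy = accuracy(labels_true, labels_pred)
print(f"severity RMSE (normalized): {severity_rmse:.4f}")
print(f"binary accuracy: {binary_accuracy:.2%}")
```

Whether the RMSE is computed on normalized or raw scores changes its scale, so the reported figures should only be compared against results that use the same convention.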

Top-30

Journals

Biomedical Signal Processing and Control: 11 publications, 5.79%
Journal of Affective Disorders: 6 publications, 3.16%
Lecture Notes in Computer Science: 6 publications, 3.16%
IEEE Transactions on Affective Computing: 5 publications, 2.63%
Multimedia Tools and Applications: 4 publications, 2.11%
Scientific Reports: 4 publications, 2.11%
Computers in Biology and Medicine: 3 publications, 1.58%
Computer Methods and Programs in Biomedicine: 3 publications, 1.58%
IEEE Journal of Biomedical and Health Informatics: 3 publications, 1.58%
IEEE Access: 3 publications, 1.58%
Communications in Computer and Information Science: 3 publications, 1.58%
Frontiers in Digital Health: 2 publications, 1.05%
Digital Signal Processing: A Review Journal: 2 publications, 1.05%
Knowledge-Based Systems: 2 publications, 1.05%
Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference: 2 publications, 1.05%
Computer Speech and Language: 2 publications, 1.05%
Applied Sciences (Switzerland): 2 publications, 1.05%
Information Fusion: 2 publications, 1.05%
Lecture Notes in Electrical Engineering: 2 publications, 1.05%
Heliyon: 2 publications, 1.05%
Engineering Applications of Artificial Intelligence: 2 publications, 1.05%
Applied Acoustics: 2 publications, 1.05%
Speech Communication: 2 publications, 1.05%
Sensors: 2 publications, 1.05%
Frontiers in Psychology: 2 publications, 1.05%
Diagnostics: 1 publication, 0.53%
Artificial Life and Robotics: 1 publication, 0.53%
BioMedInformatics: 1 publication, 0.53%
Wireless Personal Communications: 1 publication, 0.53%

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 60 publications, 31.58%
Elsevier: 53 publications, 27.89%
Springer Nature: 34 publications, 17.89%
MDPI: 11 publications, 5.79%
Frontiers Media S.A.: 7 publications, 3.68%
Association for Computing Machinery (ACM): 5 publications, 2.63%
JMIR Publications: 4 publications, 2.11%
Wiley: 4 publications, 2.11%
SAGE: 1 publication, 0.53%
Hindawi Limited: 1 publication, 0.53%
Centre for Evaluation in Education and Science (CEON/CEES): 1 publication, 0.53%
XMLink: 1 publication, 0.53%
Taylor & Francis: 1 publication, 0.53%
IGI Global: 1 publication, 0.53%
Oxford University Press: 1 publication, 0.53%
SPIE-Intl Soc Optical Eng: 1 publication, 0.53%
IMR Press: 1 publication, 0.53%
FSFEI HE Don State Technical University: 1 publication, 0.53%
Hogrefe Publishing Group: 1 publication, 0.53%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics
190
Cite this
GOST
Rejaibi E. et al. MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech // Biomedical Signal Processing and Control. 2022. Vol. 71. p. 103107.
GOST (all authors)
Rejaibi E., Komaty A., Meriaudeau F., Agrebi S., Othmani A. MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech // Biomedical Signal Processing and Control. 2022. Vol. 71. p. 103107.
RIS
TY - JOUR
DO - 10.1016/j.bspc.2021.103107
UR - https://doi.org/10.1016/j.bspc.2021.103107
TI - MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech
T2 - Biomedical Signal Processing and Control
AU - Rejaibi, Emna
AU - Komaty, Ali
AU - Meriaudeau, F.
AU - Agrebi, Said
AU - Othmani, Alice
PY - 2022
DA - 2022/01/01
PB - Elsevier
SP - 103107
VL - 71
SN - 1746-8094
SN - 1746-8108
ER -
BibTeX
@article{2022_Rejaibi,
author = {Emna Rejaibi and Ali Komaty and F. Meriaudeau and Said Agrebi and Alice Othmani},
title = {MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech},
journal = {Biomedical Signal Processing and Control},
year = {2022},
volume = {71},
publisher = {Elsevier},
month = {jan},
url = {https://doi.org/10.1016/j.bspc.2021.103107},
pages = {103107},
doi = {10.1016/j.bspc.2021.103107}
}