ACM Computing Surveys, volume 57, issue 8, pages 1-25

Facial Expression Analysis in Parkinson's Disease Using Machine Learning: A Review

Guilherme Camargo de Oliveira 1, 2, 3
Cuong Q Ngo 1
Leandro A. Passos 2
Danilo Samuel Jodas 2
João P. Papa 4
Dinesh K Kumar 1
1 Engineering, RMIT University, Melbourne, Australia
2 São Paulo State University, Bauru, Brazil
3 Engineering, RMIT University, Melbourne, Australia and São Paulo State University, Bauru, Brazil
4 Computing, São Paulo State University, Bauru, Brazil
Publication type: Journal Article
Publication date: 2025-03-23
Scimago: Q1
SJR: 6.280
CiteScore: 33.2
Impact factor: 23.8
ISSN: 03600300, 15577341
Abstract

Computerised facial expression analysis is used in a range of social and commercial applications, and its potential in medicine, for example to detect Parkinson’s disease (PD), is now emerging. This opens possibilities for telehealth and population screening. Facial expression analysis using machine learning is a relatively recent development, with most of the published work appearing after 2019. We performed a systematic review of English-language publications on the topic from 2019 to 2024 to capture trends and identify research opportunities that will facilitate the translation of this technology to recognising PD. The review shows significant advancements in the field, with facial expressions emerging as a potential biomarker for PD. Different machine learning models, from shallow to deep learning, could detect PD faces. However, the main limitation is the reliance on limited datasets. Furthermore, while significant progress has been made, model generalization must be tested before clinical application.

Munsif M., Sajjad M., Ullah M., Tarekegn A.N., Cheikh F.A., Tsakanikas P., Muhammad K.
2024-09-01 citations by CoLab: 6 Abstract  
Facial Expression Analysis (FEA) plays a vital role in diagnosing and treating early-stage neurological disorders (NDs) like Alzheimer's and Parkinson's. Manual FEA is hindered by expertise, time, and training requirements, while automatic methods confront difficulties with real patient data unavailability, high computations, and irrelevant feature extraction. To address these challenges, this paper proposes a novel approach: an efficient, lightweight convolutional block attention module (CBAM) based deep learning network (DLN) to aid doctors in diagnosing ND patients. The method comprises two stages: data collection of real ND patients, and pre-processing, involving face detection and an attention-enhanced DLN for feature extraction and refinement. Extensive experiments with validation on real patient data showcase compelling performance, achieving an accuracy of up to 73.2%. The proposed model is also lightweight, occupying only 3MB, making it suitable for deployment on resource-constrained mobile healthcare devices. Moreover, the method exhibits significant advancements over existing FEA approaches, holding tremendous promise in effectively diagnosing and treating ND patients. By accurately recognizing emotions and extracting relevant features, this approach empowers medical professionals in early ND detection and management, overcoming the challenges of manual analysis and heavy models. In conclusion, this research presents a significant leap in FEA, promising to enhance ND diagnosis and care. The code and data used in this work are available at: https://github.com/munsif200/Neurological-Health-Care.
Lv C., Fan L., Li H., Ma J., Jiang W., Ma X.
2024-09-01 citations by CoLab: 4 Abstract  
In the pursuit of enhancing PD detection, this study introduces an advanced, fully automated video-based model leveraging a comprehensive audio-visual dataset. We address the challenge of objectively capturing hypokinetic dysarthria, a key early symptom of PD, by collecting and analyzing a substantial dataset comprising audio-visual samples from 130 PD patients and 90 healthy participants. This large-scale dataset is critical in filling the existing gap in resources for PD research. Our approach utilizes a novel audio-visual fusion model, which integrates two distinct branches for extracting visual features and audio Mel-spectrogram features, both strongly correlated with PD. The integration of these features is achieved through a sophisticated Transformer-based cross-attention module, to effectively learn the complementarity between audio and visual cues. This integrated approach significantly enhances PD detection accuracy, achieving a rate of 92.68%, which surpasses the performance of conventional machine learning and deep learning models in PD diagnosis. The incorporation of visual information alongside audio data is also proven to be more effective in detecting PD than relying solely on speech signals, demonstrating the potential of our cross-attention fusion model in broader clinical settings. This study contributes to the digital diagnosis of PD based on audio-visual features. The performance of the proposed model is verified through experiments. These results can help doctors make a preliminary diagnosis of patients and even support remote diagnosis.
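The fusion idea described in this abstract can be illustrated with generic scaled dot-product cross-attention, in which features from one modality act as queries over the other. The following is a minimal sketch under stated assumptions, not the authors' module: it omits learned projections, multiple heads, and training, and the function names and toy feature vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from one modality,
    keys and values from the other (no learned projections)."""
    dim = len(keys[0])
    width = len(values[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(width)])
    return fused


# Hypothetical toy features: 2 audio frames attend over 3 visual frames.
audio_feats = [[1.0, 0.0], [0.0, 1.0]]
visual_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(audio_feats, visual_feats, visual_feats)
print(len(fused), len(fused[0]))
```

Each output row is a convex combination of the visual feature vectors, weighted by how closely they match the corresponding audio frame. A full Transformer block would add learned query, key, and value projections before this step.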
Razzouki A.F., Jeancolas L., Mangone G., Sambin S., Chalançon A., Gomes M., Lehéricy S., Corvol J., Vidailhet M., Arnulf I., El-Yacoubi M.A., Petrovska-Delacrétaz D.
2024-07-08 citations by CoLab: 2
Huang J., Lin L., Yu F., He X., Song W., Lin J., Tang Z., Yuan K., Li Y., Huang H., Pei Z., Xian W., Yu-Chian Chen C.
2024-03-01 citations by CoLab: 5 Abstract  
The severity evaluation of Parkinson's disease (PD) is of great significance for the treatment of PD. However, existing methods either have limitations based on prior knowledge or are invasive methods. To propose a more generalized severity evaluation model, this paper proposes an explainable 3D multi-head attention residual convolution network. First, we introduce the 3D attention-based convolution layer to extract video features. Second, features will be fed into LSTM and residual backbone networks, which can be used to capture the contextual information of the video. Finally, we design a feature compression module to condense the learned contextual features. We develop some interpretable experiments to better explain this black-box model so that it can be better generalized. Experiments show that our model can achieve state-of-the-art diagnosis performance. The proposed lightweight but effective model is expected to serve as a suitable end-to-end deep learning baseline in future research on PD video-based severity evaluation and has the potential for large-scale application in PD telemedicine. The source code is available at https://github.com/JackAILab/MARNet.
Oliveira G.C., Ngo Q.C., Passos L.A., Papa J.P., Jodas D., Kumar D.
2023-10-01 citations by CoLab: 11 Abstract  
Background and Objective: This paper presents a method for the computerized detection of hypomimia in people with Parkinson’s disease (PD). It overcomes the difficulty of the small and unbalanced size of available datasets. Methods: A public dataset consisting of features of the video recordings of people with PD with four facial expressions was used. Synthetic data was generated using Conditional Generative Adversarial Network (CGAN) for training and Test-Time Augmentation was used to augment the training data. The classification was conducted using the original test set to prevent bias in the results. Results: The employment of CGAN followed by Test-Time Augmentation led to a video classification accuracy of 83%, specificity of 82%, and sensitivity of 85% on a test set in which the prevalence of PD was around 7% and only real data was used. This is a significant improvement compared with other similar studies. The results show that while the technique was able to detect people with PD, there were a number of false positives. Hence this is suitable for applications such as population screening or assisting clinicians, but at this stage is not suitable for diagnosis. Conclusions: This work has the potential to assist neurologists in performing online diagnosis and monitoring of their patients. However, it is essential to test it across different ethnicities and to test its repeatability.
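The remark about false positives can be made concrete with Bayes' rule applied to the figures quoted in this abstract (sensitivity 85%, specificity 82%, prevalence around 7%). The calculation below is illustrative arithmetic only; the positive predictive value it yields is not a result reported by the authors, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive prediction is a true PD case."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures taken from the abstract; prevalence is stated as "around 7%".
sensitivity, specificity, prevalence = 0.85, 0.82, 0.07

ppv = positive_predictive_value(sensitivity, specificity, prevalence)
accuracy = sensitivity * prevalence + specificity * (1.0 - prevalence)

print(f"PPV: {ppv:.2f}")            # about 0.26
print(f"Accuracy: {accuracy:.2f}")  # about 0.82
```

Under these assumptions roughly one in four positive predictions would be a true PD case, which is consistent with the abstract's view that the method suits screening or clinician support rather than diagnosis.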
Loveleen G., Mohan B., Shikhar B.S., Nz J., Shorfuzzaman M., Masud M.
2023-09-26 citations by CoLab: 32 Abstract  
Directing research on Alzheimer’s disease toward only early prediction and accuracy cannot be considered a feasible approach toward tackling a ubiquitous degenerative disease today. Applying deep learning (DL), Explainable artificial intelligence, and advancing toward the human-computer interface (HCI) model can be a leap forward in medical research. This research aims to propose a robust explainable HCI model using Shapley additive explanations, local interpretable model-agnostic explanations, and DL algorithms. The use of DL algorithms, namely logistic regression (80.87%), support vector machine (85.8%), k-nearest neighbor (87.24%), multilayer perceptron (91.94%), and decision tree (100%), together with explainability can help in exploring untapped avenues for research in medical sciences that can mold the future of HCI models. The presented model’s results show improved prediction accuracy by incorporating a user-friendly computer interface into decision-making, implying a high significance level in the context of biomedical and clinical research.
Passos L.A., Papa J.P., Hussain A., Adeel A.
Neurocomputing scimago Q1 wos Q1
2023-03-01 citations by CoLab: 13 Abstract  
Despite the recent success of machine learning algorithms, most models face drawbacks when considering more complex tasks requiring interaction between different sources, such as multimodal input data and logical time sequences. On the other hand, the biological brain is highly sharpened in this sense, empowered to automatically manage and integrate such streams of information. In this context, this work draws inspiration from recent discoveries in brain cortical circuits to propose a more biologically plausible self-supervised machine learning approach. This combines multimodal information using intra-layer modulations together with Canonical Correlation Analysis, and a memory mechanism to keep track of temporal data, the overall approach termed Canonical Cortical Graph Neural networks. This is shown to outperform recent state-of-the-art models in terms of clean audio reconstruction and energy efficiency for a benchmark audio-visual speech dataset. The enhanced performance is demonstrated through a reduced and smoother neuron firing rate distribution, suggesting that the proposed model is amenable for speech enhancement in future audio-visual hearing aid devices.
Gomez L.F., Morales A., Fierrez J., Orozco-Arroyave J.R.
PLoS ONE scimago Q1 wos Q1 Open Access
2023-02-02 citations by CoLab: 13 Abstract
Background and objective: Patients suffering from Parkinson’s disease (PD) present a reduction in facial movements called hypomimia. In this work, we propose to use machine learning facial expression analysis from face images based on action unit domains to improve PD detection. We propose different domain adaptation techniques to exploit the latest advances in automatic face analysis and face action unit detection. Methods: Three different approaches are explored to model facial expressions of PD patients: (i) face analysis using single frame images and also using sequences of images, (ii) transfer learning from face analysis to action units recognition, and (iii) triplet-loss functions to improve the automatic classification between patients and healthy subjects. Results: Experiments on real face images from PD patients show that it is possible to properly model elicited facial expressions using image sequences (neutral, onset-transition, apex, offset-transition, and neutral) with accuracy improvements of up to 5.5% (from 72.9% to 78.4%) with respect to single-image PD detection. We also show that our proposed action unit domain adaptation provides improvements of up to 8.9% (from 78.4% to 87.3%) with respect to face analysis. Finally, we also show that triplet-loss functions provide improvements of up to 3.6% (from 78.8% to 82.4%) with respect to action unit domain adaptation applied upon models created from scratch. The code of the experiments is available at https://github.com/luisf-gomez/Explorer-FE-AU-in-PD. Conclusions: Domain adaptation via transfer learning methods seems to be a promising strategy to model hypomimia in PD patients. Considering the good results and also the fact that only up to five images per participant are considered in each sequence, we believe that this work is a step forward in the development of inexpensive computational systems suitable to model and quantify problems of PD patients in their facial expressions.
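The triplet-loss idea mentioned in this abstract can be sketched in its textbook form: an anchor embedding is pulled toward a sample of the same class and pushed away from a sample of the other class by at least a margin. This is a generic formulation, not the authors' implementation (which is in the repository linked above); the margin value, the Euclidean distance, and the toy 2-D embeddings are assumptions for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical embeddings: anchor and positive from PD patients,
# negative from a healthy control.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
well_separated = [3.0, 4.0]   # distance 5 from the anchor
too_close = [0.0, 1.5]        # distance 1.5 from the anchor

print(triplet_loss(anchor, positive, well_separated))  # 0.0
print(triplet_loss(anchor, positive, too_close))       # 0.5
```

Only the second triplet contributes a gradient during training, since the first already satisfies the margin. Some formulations use squared distances instead; the structure of the loss is otherwise the same.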
Lee T.K., Yankee E.L.
2022-09-08 citations by CoLab: 63 Abstract  
Parkinson’s disease (PD) is a neurodegenerative illness and has a common onset between the ages of 55 and 65 years. There is progressive development of both motor and non-motor symptoms, greatly affecting one’s overall quality of life. While there is no cure, various treatments have been developed to help manage the symptoms of PD. Management of PD is a growing field and targets new treatment methods, as well as improvements to old ones. Pharmacological, surgical, and therapeutic treatments have allowed physicians to treat not only the main motor symptoms of PD, but target patient-specific problems as they arise. This review discusses both the established and new possibilities for PD treatment that can provide patient-specific care and mitigate side effects for common treatments.
Pegolo E., Volpe D., Cucca A., Ricciardi L., Sawacha Z.
Sensors scimago Q1 wos Q2 Open Access
2022-02-10 citations by CoLab: 11 Abstract
Parkinson’s disease (PD) is a neurological disorder that mainly affects the motor system. Among other symptoms, hypomimia is considered one of the clinical hallmarks of the disease. Despite its great impact on patients’ quality of life, it remains under-investigated. The aim of this work is to provide a quantitative index for hypomimia that can distinguish pathological and healthy subjects and that can be used in the classification of emotions. A face tracking algorithm was implemented based on the Facial Action Coding System. A new easy-to-interpret metric (face mobility index, FMI) was defined considering distances between pairs of geometric features and a classification based on this metric was proposed. Comparison was also provided between healthy controls and PD patients. Results of the study suggest that this index can quantify the degree of impairment in PD and can be used in the classification of emotions. Statistically significant differences were observed for all emotions when distances were taken into account, and for happiness and anger when FMI was considered. The best classification results were obtained with Random Forest and kNN according to the AUC metric.
Jakubowski J., Potulska-Chromik A., Białek K., Nojszewska M., Kostera-Pruszczyk A.
Electronics (Switzerland) scimago Q2 wos Q2 Open Access
2021-11-18 citations by CoLab: 8 Abstract
One of the symptoms of Parkinson’s disease is the occurrence of problems with the expression of emotions on the face, called facial masking, facial bradykinesia or hypomimia. Recent medical studies show that this symptom can be used in the diagnosis of this disease. In the presented study, the authors, on the basis of their own research, try to answer the question of whether it is possible to build an automatic Parkinson’s disease recognition system based on the face image. The research used image recordings in visible light and infrared. The material for the study consisted of recordings from a group of patients with Parkinson’s disease and a group of healthy subjects. The participants were asked to show a neutral facial expression and a smile. In the detection, both geometric and holistic methods based on the use of a convolutional network and image fusion were used. The obtained results were assessed quantitatively using statistical measures, including the F1 score, which reached 0.941. The results were compared with a competitive work on the same subject. A novelty of our experiments is that patients with Parkinson’s disease were in the so-called ON phase, in which, due to the action of drugs, the symptoms of the disease are reduced. The results obtained seem to be useful in the process of early diagnosis of this disease, especially in times of remote medical examination.
Tjoa E., Guan C.
2021-11-01 citations by CoLab: 1081 Abstract  
Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances in many tasks, from image processing to natural language processing, especially with the advent of deep learning (DL). Along with research progress, they have encroached upon many different fields and disciplines. Some of them require a high level of accountability and thus transparency, for example, the medical sector. Explanations for machine decisions and predictions are thus needed to justify their reliability. This requires greater interpretability, which often means we need to understand the mechanism underlying the algorithms. Unfortunately, the black-box nature of DL is still unresolved, and many machine decisions are still poorly understood. We provide a review on interpretabilities suggested by different research works and categorize them. The different categories show different dimensions in interpretability research, from approaches that provide “obviously” interpretable information to the studies of complex patterns. By applying the same categorization to interpretability in medical research, it is hoped that: 1) clinicians and practitioners can subsequently approach these methods with caution; 2) insight into interpretability will be born with more considerations for medical practices; and 3) initiatives to push forward data-based, mathematically grounded, and technically grounded medical education are encouraged.
Su G., Lin B., Luo W., Yin J., Deng S., Gao H., Xu R.
2021-10-26 citations by CoLab: 11 Abstract  
Parkinson’s disease is the second most common neurodegenerative disorder, commonly affecting elderly people over the age of 65. As the cardinal manifestation, hypomimia, referred to as impairments in normal facial expressions, stays covert. Even some experienced doctors may miss these subtle changes, especially in a mild stage of this disease. The existing methods for hypomimia recognition are mainly dominated by statistical variable-based methods with the help of traditional machine learning algorithms. Despite the success of recognizing hypomimia, they show a limited accuracy and lack the capability of performing semantic analysis. Therefore, developing a computer-aided diagnostic method for semantically recognizing hypomimia is appealing. In this article, we propose a Semantic Feature based Hypomimia Recognition network, named SFHR-NET, to recognize hypomimia based on facial videos. First, a Semantic Feature Classifier (SF-C) is proposed to adaptively adjust feature maps salient to hypomimia, which leads the encoder and classifier to focus more on areas of hypomimia-interest. In SF-C, the progressive confidence strategy (PCS) ensures more reliable semantic features. Then, a two-stream framework is introduced to fuse the spatial data stream and temporal optical stream, which allows the encoder to semantically and progressively characterize the rigid process of hypomimia. Finally, to improve the interpretability of the model, Gradient-weighted Class Activation Mapping (Grad-CAM) is integrated to generate attention maps that cast our engineered features into hypomimia-interest regions. These highlighted regions provide visual explanations for decisions of our network. Experimental results based on real-world data demonstrate the effectiveness of our method in detecting hypomimia.