Image and Vision Computing, volume 81, pages 1-14

Learning facial action units with spatiotemporal cues and multi-label sampling

Wen Sheng Chu ¹

Fernando De La Torre ¹

Jeffrey S. Cohn ²

¹ Robotics Institute, Carnegie Mellon University, Pittsburgh, USA

² Department of Psychology, University of Pittsburgh, Pittsburgh, USA

Publication type: Journal Article

Publication date: 2019-01-01

Publisher: Elsevier

Journal: Image and Vision Computing

SJR: 1.204

CiteScore: 8.5

Impact factor: 4.2

ISSN: 0262-8856 (print), 1872-8138

DOI: 10.1016/j.imavis.2018.10.002

Signal Processing

Computer Vision and Pattern Recognition

Abstract

Facial action units (AUs) may be represented spatially, temporally, and in terms of their correlation. Previous research focuses on one or another of these aspects or addresses them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations and a Long Short-Term Memory (LSTM) network to model temporal dependencies among them. The outputs of the CNNs and LSTMs are aggregated in a fusion network to produce per-frame predictions of multiple AUs. The hybrid network was compared with previous state-of-the-art approaches on two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and achieved higher accuracy for AU detection. To address class imbalance within and between batches while training the network, we introduce multi-label sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualizations of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs.
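The abstract mentions multi-label sampling to counter class imbalance within batches but does not describe the procedure. The following is a minimal sketch of one possible strategy, not the authors' algorithm: a greedy sampler that repeatedly draws a frame containing whichever AU is currently rarest in the batch. The function name `sample_balanced_batch`, the greedy rule, sampling with replacement, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def sample_balanced_batch(labels, batch_size, rng):
    """Greedy multi-label batch sampler (illustrative assumption only).

    labels: list of sets; labels[i] holds the AU ids active in frame i.
    Returns (batch, counts): sampled frame indices and per-AU positive counts.
    Frames with no active AU are never drawn by this simplified rule.
    """
    # Index frames by each AU they contain.
    frames_by_au = {}
    for idx, aus in enumerate(labels):
        for au in aus:
            frames_by_au.setdefault(au, []).append(idx)
    if not frames_by_au:
        raise ValueError("no frame has an active AU")

    counts = Counter({au: 0 for au in frames_by_au})
    batch = []
    while len(batch) < batch_size:
        # Pick the AU with the fewest positives so far (ties: lowest AU id).
        rarest = min(counts, key=lambda au: (counts[au], au))
        idx = rng.choice(frames_by_au[rarest])  # sampled with replacement
        batch.append(idx)
        # A frame may carry several AUs, so update every count it touches.
        for au in labels[idx]:
            counts[au] += 1
    return batch, counts


# Toy data: AU12 is common, AU6 co-occurs with it rarely, AU1 is very sparse.
toy_labels = [{12}] * 90 + [{6, 12}] * 8 + [{1}] * 2
batch, counts = sample_balanced_batch(toy_labels, 32, random.Random(0))
print(dict(counts))
```

Because a single frame can be positive for several AUs, drawing for one AU also raises the counts of its co-occurring AUs; this coupling is what distinguishes multi-label balancing from ordinary per-class oversampling.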

Journals

ACM Transactions on Computing for Healthcare: 1 publication, 7.14%
Frontiers in Computer Science: 1 publication, 7.14%
Frontiers in Signal Processing: 1 publication, 7.14%
Pattern Analysis and Applications: 1 publication, 7.14%
Future Generation Computer Systems: 1 publication, 7.14%
IEEE Transactions on Affective Computing: 1 publication, 7.14%
IEEE Access: 1 publication, 7.14%
Lecture Notes in Computer Science: 1 publication, 7.14%
IEEE Transactions on Image Processing: 1 publication, 7.14%

Publishers

Institute of Electrical and Electronics Engineers (IEEE): 3 publications, 21.43%
Frontiers Media S.A.: 2 publications, 14.29%
Springer Nature: 2 publications, 14.29%
Association for Computing Machinery (ACM): 1 publication, 7.14%
Elsevier: 1 publication, 7.14%

We do not take into account publications without a DOI.
Statistics recalculated only for publications connected to researchers, organizations and labs registered on the platform.
Statistics recalculated weekly.

Cite this

GOST

Chu W. S., Torre F. D. L., Cohn J. S. Learning facial action units with spatiotemporal cues and multi-label sampling // Image and Vision Computing. 2019. Vol. 81. pp. 1-14.

RIS

TY - JOUR
DO - 10.1016/j.imavis.2018.10.002
UR - https://doi.org/10.1016/j.imavis.2018.10.002
TI - Learning facial action units with spatiotemporal cues and multi-label sampling
T2 - Image and Vision Computing
AU - Chu, Wen Sheng
AU - Torre, Fernando De La
AU - Cohn, Jeffrey S.
PY - 2019
DA - 2019/01/01
PB - Elsevier
SP - 1-14
VL - 81
SN - 0262-8856
SN - 1872-8138
ER -

BibTeX

@article{2019_Chu,
  author = {Wen Sheng Chu and Fernando De La Torre and Jeffrey S. Cohn},
  title = {Learning facial action units with spatiotemporal cues and multi-label sampling},
  journal = {Image and Vision Computing},
  year = {2019},
  volume = {81},
  publisher = {Elsevier},
  month = {jan},
  url = {https://doi.org/10.1016/j.imavis.2018.10.002},
  pages = {1--14},
  doi = {10.1016/j.imavis.2018.10.002}
}
