
Facial Expression Recognition via Collecting Large-scale Unlabeled Video Data

Alternative Title
레이블이 지정되지 않은 대규모 비디오 데이터 수집을 통한 표정 인식
Author(s)
Cho, Yun Seong
Advisor
Baek, Seungryul
Issued Date
2023-02
URI
https://scholarworks.unist.ac.kr/handle/201301/74030
http://unist.dcollection.net/common/orgView/200000667121
Abstract
Facial expression recognition (FER), which classifies facial expressions from input images, has advanced considerably thanks to deep learning. While deep learning requires large-scale datasets, collecting accurately annotated data is challenging in FER: precisely discretizing expressions is difficult due to the subtlety and complexity of facial expressions and the subjectivity of annotators. In this paper, we collected a large-scale set of reaction mashup (RM) videos (without expression annotations) from YouTube, each capturing multiple persons' facial reactions while watching the same film. Based on this, we propose a novel contrastive learning framework for FER composed of two stages: inter-sample attention learning (IAL) and attention-based contrastive learning (ACL). In IAL, we train the baseline FER network and learn the expression similarity of sample pairs from a benchmark dataset and its discretized expression annotations. In ACL, we apply contrastive learning to the collected RM videos using priors combined with the learned expression similarities: given an anchor face, different persons' faces in nearby frames that exhibit high similarity are used as positive samples, while the same person's faces in distant frames that exhibit low similarity are used as negative samples. Experimental results show that the proposed method effectively improves the distribution of learned features, reflecting continuous variations of facial expressions, and thereby outperforms previous state-of-the-art methods on three FER benchmark datasets (AffectNet, RAF-DB, and FERPlus).
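The positive/negative sampling rule of the ACL stage can be sketched as follows. This is a minimal illustration, not the thesis implementation: the face records, thresholds (`t_near`, `t_far`, `sim_hi`, `sim_lo`), and the toy similarity function standing in for the similarity learned in IAL are all hypothetical.

```python
# Sketch of the ACL positive/negative sampling rule described in the abstract.
# All thresholds and data structures here are illustrative assumptions.

def select_pairs(anchor, candidates, sim, t_near=5, t_far=30,
                 sim_hi=0.8, sim_lo=0.3):
    """Return (positives, negatives) for an anchor face.

    positives: different persons in nearby frames with high expression similarity
    negatives: the same person in distant frames with low expression similarity
    """
    positives, negatives = [], []
    for c in candidates:
        dt = abs(c["frame"] - anchor["frame"])  # temporal distance in frames
        s = sim(anchor, c)                      # learned expression similarity
        if c["person"] != anchor["person"] and dt <= t_near and s >= sim_hi:
            positives.append(c)  # reacting to the same moment, similar expression
        elif c["person"] == anchor["person"] and dt >= t_far and s <= sim_lo:
            negatives.append(c)  # same identity, but a different expression
    return positives, negatives


# Toy usage: a scalar "expr" value stands in for a learned expression embedding.
anchor = {"person": "A", "frame": 100, "expr": 0.90}
candidates = [
    {"person": "B", "frame": 102, "expr": 0.85},  # nearby, different person, similar
    {"person": "A", "frame": 200, "expr": 0.10},  # distant, same person, dissimilar
    {"person": "B", "frame": 250, "expr": 0.90},  # distant, different person: neither
]
sim = lambda a, b: 1.0 - abs(a["expr"] - b["expr"])
pos, neg = select_pairs(anchor, candidates, sim)
```

In a full pipeline, the selected `pos`/`neg` sets would feed a standard contrastive objective (e.g. an InfoNCE-style loss) over the anchor's embedding.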
Publisher
Ulsan National Institute of Science and Technology (UNIST)
Degree
Master
Major
Graduate School of Artificial Intelligence


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.