Facial expression recognition (FER), which classifies facial expressions from input images, has advanced considerably thanks to deep learning. However, deep learning requires large-scale datasets, and collecting accurately annotated data is challenging in FER: discretizing expressions is difficult due to the subtlety and complexity of facial expressions and the subjectivity of annotators. In this paper, we collect a large-scale set of reaction mashup (RM) videos (without expression annotations) from YouTube, each of which contains multiple persons’ facial reactions to the same film. Based on this, we propose a novel contrastive learning framework for FER composed of two stages: an inter-sample attention learning (IAL) stage and an attention-based contrastive learning (ACL) stage. In IAL, we train a baseline FER network and learn the expression similarity of sample pairs using a benchmark dataset and its discretized expression annotations. In ACL, we apply contrastive learning to the collected RM videos using priors combined with the learned expression similarities: given an anchor face, different persons’ faces in nearby frames that exhibit high similarity are used as positive samples, while the same person’s faces in distant frames that exhibit low similarity are used as negative samples. Experimental results show that the proposed method effectively improves the distribution of learned features to reflect continuous variations of facial expressions, thereby outperforming previous state-of-the-art methods on three FER benchmark datasets (i.e., AffectNet, RAF-DB, and FERPlus).
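The positive/negative sampling rule of the ACL stage can be illustrated with a minimal sketch. Note this is an assumption-laden toy version, not the paper's implementation: the thresholds (`window`, `tau_pos`, `tau_neg`), the use of cosine similarity, and the InfoNCE-style loss are all hypothetical choices standing in for the learned expression similarity and the paper's actual objective.

```python
# Toy sketch of ACL's pair selection: positives are different persons' faces in
# nearby frames with high (learned) similarity; negatives are the same person's
# faces in distant frames with low similarity. All thresholds and the InfoNCE
# form are illustrative assumptions, not the paper's exact method.
import numpy as np

def cosine_sim(a, b):
    # Stand-in for the expression similarity learned in the IAL stage.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_pairs(anchor, faces, window=2, tau_pos=0.8, tau_neg=0.2):
    """faces: list of dicts with 'person', 'frame', 'emb' keys."""
    pos, neg = [], []
    for f in faces:
        s = cosine_sim(anchor['emb'], f['emb'])
        near = abs(f['frame'] - anchor['frame']) <= window
        if f['person'] != anchor['person'] and near and s >= tau_pos:
            pos.append(f)          # different person, nearby frame, high similarity
        elif f['person'] == anchor['person'] and not near and s <= tau_neg:
            neg.append(f)          # same person, distant frame, low similarity
    return pos, neg

def info_nce(anchor, pos, neg, temp=0.1):
    """InfoNCE-style contrastive loss over the selected pairs."""
    losses = []
    for p in pos:
        num = np.exp(cosine_sim(anchor['emb'], p['emb']) / temp)
        den = num + sum(np.exp(cosine_sim(anchor['emb'], n['emb']) / temp)
                        for n in neg)
        losses.append(-np.log(num / den))
    return float(np.mean(losses)) if losses else 0.0
```

Pulling positives across different persons reacting to the same film is what makes the unlabeled RM data useful: synchronized reactions provide a weak-supervision signal without any expression annotation.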
Publisher: Ulsan National Institute of Science and Technology (UNIST)