File Download

There are no files associated with this item.

Related Researcher

안혜민

Ahn, Hyemin

Detailed Information

Full metadata record

DC Field Value Language
dc.citation.conferencePlace US -
dc.citation.endPage 6046 -
dc.citation.startPage 6037 -
dc.citation.title Workshop on Applications of Computer Vision -
dc.contributor.author Mascaro, Esteve Valls -
dc.contributor.author Ahn, Hyemin -
dc.contributor.author Lee, Dongheui -
dc.date.accessioned 2024-01-28T10:05:08Z -
dc.date.available 2024-01-28T10:05:08Z -
dc.date.created 2023-12-01 -
dc.date.issued 2023-01-05 -
dc.description.abstract To anticipate how a person will act in the future, it is essential to understand the human intention, since it guides the subject towards a certain action. In this paper, we propose a hierarchical architecture which assumes that a sequence of human actions (low-level) can be derived from the human intention (high-level). Based on this, we address the long-term action anticipation task in egocentric videos. Our framework first extracts this low- and high-level human information over the observed human actions in a video through a Hierarchical Multi-task Multi-Layer Perceptrons Mixer (H3M). Then, we constrain the uncertainty of the future through an Intention-Conditioned Variational Auto-Encoder (I-CVAE) that generates multiple stable predictions of the next actions that the observed human might perform. By leveraging human intention as high-level information, we claim that our model is able to anticipate more time-consistent actions in the long term, thus improving the results over the baseline on the Ego4D dataset. This work achieves state-of-the-art results on the Long-Term Anticipation (LTA) task in Ego4D by providing more plausible anticipated sequences and improving the anticipation scores of nouns and actions. Our work ranked first in both the CVPR@2022 and ECCV@2022 Ego4D LTA Challenges. -
dc.identifier.bibliographicCitation Workshop on Applications of Computer Vision, pp.6037 - 6046 -
dc.identifier.doi 10.1109/WACV56688.2023.00599 -
dc.identifier.scopusid 2-s2.0-85149003624 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/72429 -
dc.language English -
dc.publisher Institute of Electrical and Electronics Engineers Inc. -
dc.title Intention-Conditioned Long-Term Human Egocentric Action Anticipation -
dc.type Conference Paper -
dc.date.conferenceDate 2023-01-03 -

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.