Related Researcher

Baek, Seungryul (백승렬)
UNIST VISION AND LEARNING LAB.

Full metadata record

DC Field Value
dc.citation.endPage 67550
dc.citation.startPage 67541
dc.citation.title IEEE ACCESS
dc.citation.volume 10
dc.contributor.author Cha, Junuk
dc.contributor.author Saqlain, Muhammad
dc.contributor.author Kim, Donguk
dc.contributor.author Lee, Seungeun
dc.contributor.author Lee, Seongyeong
dc.contributor.author Baek, Seungryul
dc.date.accessioned 2023-12-21T14:07:54Z
dc.date.available 2023-12-21T14:07:54Z
dc.date.created 2022-07-18
dc.date.issued 2022-06
dc.description.abstract Skeleton-based human action recognition has attracted significant interest due to its simplicity and accuracy, and diverse end-to-end trainable frameworks have been proposed to better map skeletal representations to human action classes. Most of these approaches, however, rely on skeletons heuristically pre-defined by commercial sensors, and it has not been confirmed that sensor-captured skeletons are the best representation of the human body for action recognition; in general, a representation dedicated to the downstream task is required for strong performance. In this paper, we address this issue by explicitly learning the skeletal representation in the context of the action recognition task. We first reconstruct 3D meshes of human bodies from RGB videos, and then use a transformer architecture to sample the most informative skeletal representation from the reconstructed meshes, considering the intra- and inter-structural relationships of the 3D meshes and the sensor-captured skeletons. Experimental results on challenging human action recognition benchmarks (the SYSU and UTD-MHAD datasets) show the superiority of our learned skeletal representation over sensor-captured skeletons.
dc.identifier.bibliographicCitation IEEE ACCESS, v.10, pp.67541 - 67550
dc.identifier.doi 10.1109/ACCESS.2022.3185058
dc.identifier.issn 2169-3536
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/58881
dc.identifier.wosid 000819824400001
dc.language English
dc.publisher IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.title Learning 3D Skeletal Representation From Transformer for Action Recognition
dc.type Article
dc.description.isOpenAccess TRUE
dc.relation.journalWebOfScienceCategory Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications
dc.relation.journalResearchArea Computer Science; Engineering; Telecommunications
dc.type.docType Article
dc.description.journalRegisteredClass scie
dc.description.journalRegisteredClass scopus
dc.subject.keywordAuthor Three-dimensional displays
dc.subject.keywordAuthor Skeleton
dc.subject.keywordAuthor Transformers
dc.subject.keywordAuthor Task analysis
dc.subject.keywordAuthor Image reconstruction
dc.subject.keywordAuthor Videos
dc.subject.keywordAuthor Training
dc.subject.keywordAuthor 3D representation
dc.subject.keywordAuthor action recognition
dc.subject.keywordAuthor human mesh
dc.subject.keywordAuthor transformer
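
The abstract above outlines a two-stage pipeline: reconstruct 3D body meshes from RGB video, then let a transformer sample a skeletal representation from those meshes. As a rough illustration of the second stage only, here is a minimal PyTorch sketch. It is not the authors' implementation: the class name SkeletonSampler, the joint count, the model width, and the use of learnable joint queries with a standard transformer decoder are all assumptions made here for clarity.

import torch
import torch.nn as nn

class SkeletonSampler(nn.Module):
    # Hypothetical sketch, not the paper's released code: K learnable
    # "joint queries" cross-attend to mesh-vertex tokens, so the output
    # joints are differentiable samples drawn from the reconstructed mesh.
    def __init__(self, num_joints=20, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        self.vertex_proj = nn.Linear(3, d_model)   # embed (x, y, z) vertices
        self.joint_queries = nn.Parameter(torch.randn(num_joints, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.to_xyz = nn.Linear(d_model, 3)        # regress joint coordinates

    def forward(self, vertices):                   # vertices: (B, N, 3)
        tokens = self.vertex_proj(vertices)        # (B, N, d_model)
        queries = self.joint_queries.expand(vertices.size(0), -1, -1)
        joints = self.decoder(queries, tokens)     # cross-attention over the mesh
        return self.to_xyz(joints)                 # (B, K, 3) learned skeleton

# Toy usage with a random stand-in for an SMPL-style mesh (6890 vertices);
# per-frame joints would then be stacked over time for action classification.
sampler = SkeletonSampler()
mesh = torch.randn(2, 6890, 3)
print(sampler(mesh).shape)                         # torch.Size([2, 20, 3])

Because the sampled joints remain differentiable with respect to the mesh, a downstream action-recognition loss can shape where on the body they land, which is the core idea the abstract describes.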

