Full metadata record

DC Field Value Language
dc.contributor.advisor Baek, Seungryul -
dc.contributor.author LEE, SEONGYEONG -
dc.date.accessioned 2024-01-29T15:39:23Z -
dc.date.available 2024-01-29T15:39:23Z -
dc.date.issued 2022-08 -
dc.description.abstract RGB-based 3D hand pose estimation has been successful for decades thanks to large datasets and deep learning. However, the domain gap between datasets has not yet been closed. For reasons such as differences in the distribution of background features and in hand shape, a model may generalize poorly when tested on a target domain that differs from the dataset used for training. To address this problem, many existing works reduce the domain gap and improve 3D hand pose estimation through domain adaptation, which relies on additional unconstrained or target-domain images. However, when such additional images are difficult to obtain, the domain gap still needs to be reduced by other means. In this paper, we present a pose estimation framework that uses only source-domain data and uses features that generalize well to many unseen data domains. The core of our algorithm is the image encoder of CLIP (Contrastive Language-Image Pre-Training), which produces strong image representations through contrastive learning on a large number of image-text pairs and which we use to achieve generalization across domains. In addition, we use the text encoder to provide additional augmentation, which improves performance on images not seen during training. Consequently, in experiments on two datasets, STB and RHD, which represent challenging real-world scenarios, our algorithm outperforms the state of the art and generalizes better. -
dc.description.degree Master -
dc.description Department of Computer Science and Engineering -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/73841 -
dc.identifier.uri http://unist.dcollection.net/common/orgView/200000627563 -
dc.language eng -
dc.publisher Ulsan National Institute of Science and Technology (UNIST) -
dc.rights.embargoReleaseTerms 9999-12-31 -
dc.title.alternative 3D 손 포즈 추정을 위한 CLIP을 통한 이미지 없는 도메인 일반화 -
dc.title Image-free domain generalization via CLIP for 3D hand pose estimation -
dc.type Thesis -

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.