| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Baek, Seungryul | - |
| dc.contributor.author | LEE, SEONGYEONG | - |
| dc.date.accessioned | 2024-01-29T15:39:23Z | - |
| dc.date.available | 2024-01-29T15:39:23Z | - |
| dc.date.issued | 2022-08 | - |
| dc.description.abstract | RGB-based 3D hand pose estimation has advanced for decades thanks to large databases and deep learning. However, the domain gap between datasets has not yet been closed: for various reasons, such as differences in background statistics and hand shape, a model may generalize poorly when tested on a target domain different from the dataset used for training. To address this, many prior works reduce the domain gap and improve 3D hand pose estimation through domain adaptation, which generalizes with the help of additional unconstrained or target-domain images. However, in settings where such additional images are difficult to obtain, the domain gap must still be reduced. In this paper, we present a pose estimation framework that trains only on source-domain data yet learns features that generalize well to many unseen domains. The core of our algorithm is to achieve this generalization using the image encoder of CLIP (Contrastive Language-Image Pre-training), which represents images well owing to contrastive learning on a large number of image-text pairs. In addition, using the text encoder as a form of extra augmentation improves performance on images unseen during training. Consequently, in experiments on two datasets representing challenging real-world scenarios, STB and RHD, our algorithm outperforms and generalizes better than the state of the art. | - |
| dc.description.degree | Master | - |
| dc.description | Department of Computer Science and Engineering | - |
| dc.identifier.uri | https://scholarworks.unist.ac.kr/handle/201301/73841 | - |
| dc.identifier.uri | http://unist.dcollection.net/common/orgView/200000627563 | - |
| dc.language | eng | - |
| dc.publisher | Ulsan National Institute of Science and Technology (UNIST) | - |
| dc.rights.embargoReleaseTerms | 9999-12-31 | - |
| dc.title.alternative | 3D 손 포즈 추정을 위한 CLIP을 통한 이미지 없는 도메인 일반화 | - |
| dc.title | Image-free domain generalization via CLIP for 3D hand pose estimation | - |
| dc.type | Thesis | - |
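The abstract's core idea, a frozen contrastively pre-trained image encoder feeding a small pose-regression head trained only on source-domain data, can be sketched as follows. This is a hypothetical illustration, not the thesis code: the stand-in linear encoder replaces CLIP's actual ViT image encoder (e.g. the ViT-B/32 variant, which also emits 512-d embeddings), and the head architecture and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class PoseHead(nn.Module):
    """Regress 21 hand joints (x, y, z) from a 512-d image embedding.

    Hypothetical head: layer sizes are assumptions, not the thesis design.
    """
    def __init__(self, embed_dim: int = 512, num_joints: int = 21):
        super().__init__()
        self.num_joints = num_joints
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_joints * 3),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # (batch, embed_dim) -> (batch, num_joints, 3)
        return self.mlp(feat).view(-1, self.num_joints, 3)

# Stand-in for CLIP's image encoder; in practice one would load the real
# pre-trained encoder and keep its weights frozen so only the head learns
# from the source domain.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))
for p in encoder.parameters():
    p.requires_grad = False

head = PoseHead()
img = torch.randn(1, 3, 224, 224)   # one RGB hand image
joints = head(encoder(img))         # predicted 3D joints, shape (1, 21, 3)
print(joints.shape)
```

Freezing the encoder is what makes the method "image-free" at adaptation time: no target-domain images are needed, since the generalization is inherited from CLIP's large-scale image-text pre-training.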