| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Baek, Seungryul | - |
| dc.contributor.author | LEE, SEONGYEONG | - |
| dc.date.accessioned | 2024-01-29T15:39:23Z | - |
| dc.date.available | 2024-01-29T15:39:23Z | - |
| dc.date.issued | 2022-08 | - |
| dc.description.abstract | RGB-based 3D hand pose estimation has advanced for decades thanks to large databases and deep learning. However, the domain gap between datasets has not yet been closed: for various reasons, such as differences in background statistics and hand shape, a model may generalize poorly when tested on a target domain different from the dataset used for training. To address this, many prior works reduce the domain gap and improve 3D hand pose estimation through domain adaptation, which generalizes with the help of additional unconstrained or target-domain images. However, in settings where such additional images are difficult to obtain, the domain gap must still be reduced. In this paper, we present a pose estimation framework that trains only on source-domain data yet learns features that generalize well to many unseen domains. The core of our algorithm is to achieve this generalization using the image encoder of CLIP (Contrastive Language-Image Pre-training), which represents images well owing to contrastive learning on a large number of image-text pairs. In addition, using the text encoder as a form of extra augmentation improves performance on images unseen during training. Consequently, in experiments on two datasets representing challenging real-world scenarios, STB and RHD, our algorithm outperforms and generalizes better than the state of the art. | - |
| dc.description.degree | Master | - |
| dc.description | Department of Computer Science and Engineering | - |
| dc.identifier.uri | https://scholarworks.unist.ac.kr/handle/201301/73841 | - |
| dc.identifier.uri | http://unist.dcollection.net/common/orgView/200000627563 | - |
| dc.language | eng | - |
| dc.publisher | Ulsan National Institute of Science and Technology (UNIST) | - |
| dc.rights.embargoReleaseTerms | 9999-12-31 | - |
| dc.title.alternative | 3D 손 포즈 추정을 위한 CLIP을 통한 이미지 없는 도메인 일반화 | - |
| dc.title | Image-free domain generalization via CLIP for 3D hand pose estimation | - |
| dc.type | Thesis | - |
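The abstract's core idea, a frozen contrastively pre-trained image encoder feeding a small pose-regression head trained only on source-domain data, can be sketched as follows. This is a hypothetical illustration, not the thesis code: the stand-in linear encoder replaces CLIP's actual ViT image encoder (e.g. the ViT-B/32 variant, which also emits 512-d embeddings), and the head architecture and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class PoseHead(nn.Module):
    """Regress 21 hand joints (x, y, z) from a 512-d image embedding.

    Hypothetical head: layer sizes are assumptions, not the thesis design.
    """
    def __init__(self, embed_dim: int = 512, num_joints: int = 21):
        super().__init__()
        self.num_joints = num_joints
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_joints * 3),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # (batch, embed_dim) -> (batch, num_joints, 3)
        return self.mlp(feat).view(-1, self.num_joints, 3)

# Stand-in for CLIP's image encoder; in practice one would load the real
# pre-trained encoder and keep its weights frozen so only the head learns
# from the source domain.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))
for p in encoder.parameters():
    p.requires_grad = False

head = PoseHead()
img = torch.randn(1, 3, 224, 224)   # one RGB hand image
joints = head(encoder(img))         # predicted 3D joints, shape (1, 21, 3)
print(joints.shape)
```

Freezing the encoder is what makes the method "image-free" at adaptation time: no target-domain images are needed, since the generalization is inherited from CLIP's large-scale image-text pre-training.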