
Detailed Information


Full metadata record

DC Field Value Language
dc.contributor.advisor Baek, Seungryul -
dc.contributor.author Jeong, Uyoung -
dc.date.accessioned 2025-09-29T11:31:15Z -
dc.date.available 2025-09-29T11:31:15Z -
dc.date.issued 2025-08 -
dc.description.abstract We study human representations to enhance generalization capabilities in downstream applications, specifically focusing on three challenging tasks: 2D multi-person pose estimation, unified multi-dataset training for 2D pose estimation, and photorealistic 3D hand-object interaction generation. Representation learning forms the foundational scaffold of deep learning systems, and its refinement has become essential in the era of general-purpose AI. In this work, we address critical challenges in three domains: instance-level discrimination in 2D multi-person pose estimation, representation unification across heterogeneous pose datasets, and photorealistic 3D hand-object interaction generation using large-scale generative models. The first study proposes BoIR, a bounding box-level instance representation learning framework that enhances robustness in densely populated scenes. Through a multi-task learning scheme that integrates contrastive instance embeddings with spatially enriched keypoint estimation, BoIR achieves state-of-the-art performance in multi-person pose estimation under occlusions. The second contribution, PoseBH, tackles the longstanding issue of skeletal heterogeneity in multi-dataset training. By introducing nonparametric keypoint prototypes within a unified embedding space and leveraging cross-type self-supervision, PoseBH effectively aligns semantically similar keypoints across diverse pose datasets. This approach demonstrates improved generalization to novel datasets while maintaining high accuracy on established benchmarks. The final study introduces THOM, a novel pipeline for text-guided generation of 3D hand-object interacting meshes. Addressing limitations in shape diversity and physical plausibility, THOM employs a two-stage optimization strategy grounded in Gaussian representation learning and enhanced with optimization of the compositional Gaussians and interactions. This method enables the synthesis of topologically coherent and photorealistic 3D interactions, significantly outperforming existing approaches in semantic disentanglement and physical plausibility. Collectively, these contributions extend the frontiers of human representation learning from discriminative perception to generative modeling, suggesting a paradigm shift towards integrating large-scale multi-modal models in downstream human-centric tasks. This dissertation advocates for a thoughtful balance between domain specificity and general-purpose modeling to ensure robust and scalable research in human representation learning. As a final remark, we propose several future work directions that would further expand the boundaries of human-centric tasks. -
dc.description.degree Doctor -
dc.description Graduate School of Artificial Intelligence -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/88259 -
dc.identifier.uri http://unist.dcollection.net/common/orgView/200000904455 -
dc.language ENG -
dc.publisher Ulsan National Institute of Science and Technology -
dc.rights.embargoReleaseDate 9999-12-31 -
dc.rights.embargoReleaseTerms 9999-12-31 -
dc.subject computer vision,human pose estimation,multi-dataset training,multi-person pose estimation,hand-object interaction generation -
dc.title Generalizing Human-Centric Representations For Pose Estimation and Hand-Object Interaction -
dc.type Thesis -

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.