BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting

On, Jeongwan; Gwak, Kyeonghwan; Kang, Gunyoung; Cha, Junuk; Hwang, Soohyun; Hwang, Hyein; Baek, Seungryul

doi:10.1109/CVPR52734.2025.01625

Scholarworks@UNIST


Related Researcher

백승렬

Baek, Seungryul: UNIST VISION AND LEARNING LAB.


Full metadata record

DC Field	Value	Language
dc.citation.conferencePlace	US	-
dc.citation.endPage	17447	-
dc.citation.startPage	17437	-
dc.citation.title	IEEE Conference on Computer Vision and Pattern Recognition	-
dc.contributor.author	On, Jeongwan	-
dc.contributor.author	Gwak, Kyeonghwan	-
dc.contributor.author	Kang, Gunyoung	-
dc.contributor.author	Cha, Junuk	-
dc.contributor.author	Hwang, Soohyun	-
dc.contributor.author	Hwang, Hyein	-
dc.contributor.author	Baek, Seungryul	-
dc.date.accessioned	2025-12-01T16:03:30Z	-
dc.date.available	2025-12-01T16:03:30Z	-
dc.date.created	2025-11-29	-
dc.date.issued	2025-06-14	-
dc.description.abstract	Reconstructing hand-object interaction (HOI) in 3D is a fundamental problem with numerous applications. Despite recent advances, there is not yet a comprehensive pipeline for bimanual class-agnostic interaction reconstruction from a monocular RGB video, where two hands and an unknown object interact with each other. Previous works tackled limited hand-object interaction cases, where object templates are known in advance or only one hand is involved in the interaction. Bimanual interaction reconstruction exhibits severe occlusions introduced by complex interactions between two hands and an object. To solve this, we first introduce BIGS (Bimanual Interaction 3D Gaussian Splatting), a method that reconstructs 3D Gaussians of hands and an unknown object from a monocular video. To robustly obtain object Gaussians despite severe occlusions, we leverage the prior knowledge of a pre-trained diffusion model through a score distillation sampling (SDS) loss to reconstruct unseen object parts. For hand Gaussians, we exploit the 3D priors of a hand model (i.e., MANO) and share a single Gaussian for two hands to effectively accumulate hand 3D information, given limited views. To further consider the 3D alignment between hands and objects, we include an interacting-subjects optimization step during Gaussian optimization. Our method achieves state-of-the-art accuracy on two challenging datasets in terms of 3D hand pose estimation (MPJPE), 3D object reconstruction (CDh, CDo, F10), and rendering quality (PSNR, SSIM, LPIPS).	-
dc.identifier.bibliographicCitation	IEEE Conference on Computer Vision and Pattern Recognition, pp.17437 - 17447	-
dc.identifier.doi	10.1109/CVPR52734.2025.01625	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/88733	-
dc.language	English	-
dc.publisher	IEEE Computer Society	-
dc.title	BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting	-
dc.type	Conference Paper	-
dc.date.conferenceDate	2025-06-11	-
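
The abstract reports hand pose accuracy as MPJPE and rendering quality as PSNR, among other metrics. As a rough illustration of what those two numbers measure, the following is a minimal sketch using their standard textbook definitions. It is not taken from the paper or its code; the function names, the input layout (lists of 3D joint tuples, flat lists of pixel intensities), and the default peak value of 1.0 are assumptions made for this example.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean per-joint position error: the Euclidean distance between each
    predicted joint and its ground-truth joint, averaged over all joints.
    Inputs are assumed to be equal-length sequences of (x, y, z) tuples."""
    if len(pred_joints) != len(gt_joints) or not pred_joints:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_joints, gt_joints))
    return total / len(pred_joints)


def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images, given here as
    flat sequences of pixel intensities (an assumed layout for brevity)."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    # Two joints that are off by 3 and 4 units respectively.
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
    print("MPJPE:", mpjpe(pred, gt))
    print("PSNR (dB):", psnr([0.0, 0.0], [0.5, 0.5]))
```

Lower MPJPE and higher PSNR indicate better results. Published evaluations typically add conventions this sketch omits, such as alignment of the predicted hand to the ground truth before measuring MPJPE and reporting it in millimetres.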
