File Download

There are no files associated with this item.

  • Find it @ UNIST can give you direct access to the published full text of this article. (UNISTARs only)
Related Researcher

Ahn, Hyemin (안혜민)


Full metadata record

DC Field Value
dc.citation.endPage 585
dc.citation.number 6
dc.citation.startPage 581
dc.citation.title Journal of Institute of Control, Robotics and Systems
dc.citation.volume 31
dc.contributor.author Kim, Seong Hyeon
dc.contributor.author Ahn, Hyemin
dc.date.accessioned 2026-04-22T18:00:11Z
dc.date.available 2026-04-22T18:00:11Z
dc.date.created 2026-04-22
dc.date.issued 2025-06
dc.description.abstract A world model allows robots to understand and predict the interplay between their actions and environmental dynamics. Recent advancements in diffusion models have significantly improved the quality of image frame generation in simulated environments, contributing to the development of more robust and generalized world models. However, these diffusion-based world models often depend on discrete inputs, such as keyboard commands, which limit their applicability to continuous real-world robotic control. To address this limitation, we propose a novel framework that integrates contrastive learning to align visual and proprioceptive modalities (e.g., joint positions) within a shared latent space. This shared latent space facilitates accurate cross-modal predictions between visual scenes and proprioceptive states. By combining this latent representation with a diffusion model, our world model can generate long-term future visual scenes by leveraging both initial visual observations and proprioceptive states. Experimental results demonstrate that the proposed framework generates high-fidelity, long-term future visual scenes when provided with target proprioceptive data. This capability allows robots to plan their motions solely from the generated images, i.e., imagination-based planning. © ICROS 2025.
dc.identifier.bibliographicCitation Journal of Institute of Control, Robotics and Systems, v.31, no.6, pp.581 - 585
dc.identifier.doi 10.5302/J.ICROS.2025.25.0050
dc.identifier.issn 1976-5622
dc.identifier.scopusid 2-s2.0-105007990021
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/91452
dc.identifier.url https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE12246460
dc.language English
dc.publisher Institute of Control, Robotics and Systems
dc.title.alternative 고유 감각 정보 기반 시각적 장면 생성을 통한 로봇 세계 모델링을 가능케하는 대조 학습 및 디퓨전 모델
dc.title Proprioception-conditioned Visual Scene Generation for Robot World Modeling via Contrastive Learning and Diffusion
dc.type Article
dc.description.isOpenAccess FALSE
dc.identifier.kciid ART003208269
dc.type.docType Article
dc.description.journalRegisteredClass scopus
dc.description.journalRegisteredClass kci
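
The following is a minimal, hypothetical sketch of the contrastive alignment step described in the abstract above: a visual encoder and a proprioceptive encoder are trained so that a frame and the joint state recorded at the same timestep map to nearby points in a shared latent space. All module names, dimensions, and the symmetric InfoNCE formulation are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of visual-proprioceptive contrastive alignment;
# names and sizes are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VisualEncoder(nn.Module):
    """Embeds an RGB frame into the shared latent space (toy CNN backbone)."""

    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, frames):
        return self.net(frames)


class ProprioEncoder(nn.Module):
    """Embeds a proprioceptive state (e.g., 7 joint positions) into the same space."""

    def __init__(self, joint_dim: int = 7, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, joints):
        return self.net(joints)


def contrastive_loss(z_img, z_prop, temperature=0.07):
    """Symmetric InfoNCE: the frame and joint state from the same timestep
    form the positive pair; every other pairing in the batch is a negative."""
    z_img = F.normalize(z_img, dim=-1)
    z_prop = F.normalize(z_prop, dim=-1)
    logits = z_img @ z_prop.t() / temperature  # (B, B) cosine similarities
    targets = torch.arange(z_img.size(0))      # positives lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


# Toy usage on a random batch of synchronized (frame, joint-state) pairs.
frames = torch.randn(16, 3, 64, 64)
joints = torch.randn(16, 7)
loss = contrastive_loss(VisualEncoder()(frames), ProprioEncoder()(joints))
loss.backward()
```

In the full framework, the aligned latents would then condition a diffusion denoiser so that, given an initial observation and a target proprioceptive trajectory, the model can roll out long-term future visual scenes for imagination-based planning; that generation stage is beyond this sketch.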


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.