
Vehicle Pose Estimation using Language Supervision

Author(s)
Muhammadjon, Boboev
Advisor
Baek, Seungryul
Issued Date
2024-08
URI
https://scholarworks.unist.ac.kr/handle/201301/84235
http://unist.dcollection.net/common/orgView/200000804007
Abstract
Accurate understanding of traffic scenes is crucial for autonomous driving and requires precise estimation of vehicle pose and shape. This thesis introduces an approach that uses language-guided supervision to improve vehicle keypoint detection and extends it to robust 6DoF pose estimation of vehicles in the ApolloCar3D dataset. Using CLIP-pretrained image and text encoders, our method optimizes keypoint detection through prompts crafted for individual keypoints. Connecting keypoint embeddings to prompt embeddings improves keypoint detection precision, which in turn yields better pose estimation. From the detected 2D keypoints and predefined 3D vehicle models, we apply the EPnP algorithm for 6DoF pose estimation and detailed 3D scene reconstruction. Evaluations with precision metrics on the ApolloCar3D benchmark show that the approach improves both keypoint detection accuracy and 6DoF pose estimation performance over existing methods.
Publisher
Ulsan National Institute of Science and Technology
Degree
Master
Major
Department of Computer Science and Engineering
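
The abstract describes connecting each keypoint's embedding to the embedding of a prompt written for that keypoint. The sketch below illustrates one common way such a connection can be scored: cosine similarity between the two sets of embeddings, followed by a cross-entropy term that rewards keypoint i for matching prompt i. This is an assumption for illustration only; the record does not state the loss actually used in the thesis. The prompt strings, the toy 3-dimensional vectors (stand-ins for CLIP encoder outputs), the temperature value, and all function names are hypothetical.

```python
import math

# Hypothetical per-keypoint prompts; the thesis's actual prompts are not given here.
PROMPTS = [
    "the front left wheel of the car",
    "the front right headlight of the car",
    "the rear left tail light of the car",
]


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity_matrix(keypoint_embs, prompt_embs):
    """Return sims[i][j] = cosine similarity of keypoint i and prompt j."""
    kps = [l2_normalize(v) for v in keypoint_embs]
    prs = [l2_normalize(v) for v in prompt_embs]
    return [[sum(a * b for a, b in zip(k, p)) for p in prs] for k in kps]


def alignment_loss(keypoint_embs, prompt_embs, temperature=0.07):
    """Mean cross-entropy where keypoint i is expected to match prompt i.

    Assumed formulation, not confirmed by the source.
    """
    sims = cosine_similarity_matrix(keypoint_embs, prompt_embs)
    total = 0.0
    for i, row in enumerate(sims):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(sims)


# Toy embeddings standing in for encoder outputs (one row per keypoint or prompt).
prompt_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_keypoints = [[2.0, 0.1, 0.0], [0.0, 3.0, 0.2], [0.1, 0.0, 1.5]]
misaligned_keypoints = [aligned_keypoints[1], aligned_keypoints[2], aligned_keypoints[0]]

aligned_loss = alignment_loss(aligned_keypoints, prompt_embs)
misaligned_loss = alignment_loss(misaligned_keypoints, prompt_embs)
print(f"aligned: {aligned_loss:.4f}  misaligned: {misaligned_loss:.4f}")
```

In this toy setup the aligned keypoints produce a much lower loss than the shuffled ones, which is the behavior a training signal of this kind would rely on. The 2D keypoints obtained from a detector trained this way would then be paired with the 3D model points and passed to an EPnP solver, a step not shown here.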


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.