Detailed Information


Full metadata record

DC Field Value Language
dc.contributor.advisor Yang, Seungjoon -
dc.contributor.author YAHYOZODA, NASRULLOH -
dc.date.accessioned 2026-03-26T22:13:55Z -
dc.date.available 2026-03-26T22:13:55Z -
dc.date.issued 2026-02 -
dc.description.abstract Indoor scene object detection from 3D point clouds remains computationally intensive despite significant advances in deep learning architectures. Existing approaches—whether voxel-based, point-based, or transformer-based—face inherent trade-offs between detection accuracy and computational efficiency, limiting their applicability in real-time scenarios such as robotics and augmented reality. This thesis introduces a novel top-view data representation and 2.5D detection framework that achieves substantial computational efficiency gains while maintaining competitive detection accuracy. The core innovation lies in a carefully designed data representation: a Bird’s Eye View (BEV) top-view density projection that compresses 3D voxelized point clouds into 2D density maps encoding vertical occupancy information. This representation preserves essential geometric characteristics for object detection while reducing computational complexity from cubic to quadratic scaling. Building upon this representation, we adapt the YOLOv11 architecture for processing top-view density maps, achieving 123 frames per second (FPS) inference speed with minimal GPU memory footprint of 1.61 GB. Furthermore, we demonstrate that training on a unified dataset combining five benchmark indoor scene datasets with balanced sampling substantially improves generalization performance across diverse indoor environments. Experimental results show that our approach achieves 10× speedup over existing point-based methods and 5–6× improvement over efficient voxel methods, combined with 77% reduction in memory consumption, while maintaining competitive detection accuracy. Keywords: Indoor scene understanding, 3D object detection, point cloud processing, real-time detection, bird’s eye view representation, deep learning. -
dc.description.degree Master -
dc.description Department of Electrical Engineering -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/90956 -
dc.identifier.uri http://unist.dcollection.net/common/orgView/200000965879 -
dc.language ENG -
dc.publisher Ulsan National Institute of Science and Technology -
dc.subject Photoacoustic -
dc.title 2.5D Real-Time Detection on Voxelized Indoor Scene Point Clouds -
dc.type Thesis -
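The abstract above describes the core idea: a voxelized 3D point cloud is collapsed along the vertical axis into a 2D top-view density map that encodes vertical occupancy, reducing the data a detector must process from cubic to quadratic in the grid resolution. The following is a minimal illustrative sketch of such a projection, not code from the thesis itself; the function name, voxel size, and scene bounds are assumptions chosen for the example.

```python
# Illustrative sketch (not from the thesis) of a BEV top-view density
# projection: voxelize a point cloud, then count occupied voxels in each
# vertical column to get a 2D density map encoding vertical occupancy.
# `voxel_size` and `bounds` are assumed parameters for this example.
import numpy as np

def bev_density_map(points, voxel_size=0.05, bounds=((0, 4), (0, 4), (0, 3))):
    """Project an (N, 3) point cloud to a 2D top-view density map.

    Each output cell holds the number of occupied voxels in that vertical
    column, i.e. 3D vertical occupancy compressed into a 2D image.
    """
    (x0, x1), (y0, y1), (z0, z1) = bounds
    shape = (int((x1 - x0) / voxel_size),
             int((y1 - y0) / voxel_size),
             int((z1 - z0) / voxel_size))
    # Keep only points inside the bounding volume.
    mask = ((points[:, 0] >= x0) & (points[:, 0] < x1) &
            (points[:, 1] >= y0) & (points[:, 1] < y1) &
            (points[:, 2] >= z0) & (points[:, 2] < z1))
    idx = ((points[mask] - np.array([x0, y0, z0])) / voxel_size).astype(int)
    # Mark occupied voxels, then count occupancy along the vertical axis.
    grid = np.zeros(shape, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid.sum(axis=2)  # 2D density map of shape (X, Y)
```

A 2D detector such as YOLOv11 can then run on this density map as an ordinary single-channel image, which is what makes the reported real-time inference speeds plausible.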

