Related Researcher

Nam, Beomseok
Data Intensive Computing Lab
Research Interests
  • Distributed and parallel computing, high performance computing, database systems, OS and storage systems

Exploiting Massive Parallelism for Indexing Multi-dimensional Datasets on the GPU

Title
Exploiting Massive Parallelism for Indexing Multi-dimensional Datasets on the GPU
Author
Nam, Beomseok; Kim, Jinwoong; Jeong, Won-Ki
Issue Date
2015-08
Publisher
IEEE COMPUTER SOC
Citation
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, v.26, no.8, pp.2258 - 2271
Abstract
Inherently multi-dimensional n-ary indexing structures such as R-trees are not well suited to the GPU because of their irregular memory access patterns and recursive back-tracking function calls. It is known that traversing hierarchical tree structures in an irregular manner makes it difficult to exploit parallelism and to maximize the utilization of GPU processing units. Moreover, recursive tree search algorithms often fail with large indexes because of the GPU's small runtime stack. In this paper, we propose a novel parallel tree traversal algorithm, massively parallel restart scanning (MPRS), for multi-dimensional range queries that avoids recursion and irregular memory access. The proposed MPRS algorithm traverses hierarchical tree structures with mostly contiguous memory access patterns and without recursion, which offers more opportunities to optimize the parallel SIMD algorithm. We implemented the proposed MPRS range query processing algorithm on n-ary bounding volume hierarchies, including R-trees, and evaluated its performance using real scientific datasets on an NVIDIA Tesla M2090 GPU. Our experiments show that the braided-parallel, SIMD-friendly MPRS range query algorithm achieves at least 80 percent warp execution efficiency, while the task-parallel tree traversal algorithm shows only 9-15 percent efficiency. Moreover, the braided-parallel MPRS algorithm accesses 7-20 times less global memory than the task-parallel parent-link algorithm by virtue of minimal warp divergence.
URI
https://scholarworks.unist.ac.kr/handle/201301/9530
URL
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6876171
DOI
10.1109/TPDS.2014.2347041
ISSN
1045-9219
Appears in Collections:
EE_Journal Papers
