GPU in-Memory Processing Using Spark for Iterative Computation

Hong, Sumin; Choi, Woohyuk; Jeong, Won-Ki

doi:10.1109/CCGRID.2017.41

Scholarworks@UNIST

UNIST Library

File Download

There are no files associated with this item.

SFX Link

Find it @ UNIST can give you direct access to the published full text of this article. (UNISTARs only)

Related Researcher

정원기

Jeong, Won-Ki

Read More

Views & Downloads

Detailed Information

Cited time in webofscience

Cited time in scopus

Metadata Downloads

GPU in-Memory Processing Using Spark for Iterative Computation

Author(s): Hong, Sumin, Choi, Woohyuk, Jeong, Won-Ki

Issued Date: 2017-05-14

DOI: 10.1109/CCGRID.2017.41

URI: https://scholarworks.unist.ac.kr/handle/201301/32767

Fulltext: http://ieeexplore.ieee.org/document/7973686/

Citation: IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing

Abstract: Due to its simplicity and scalability, MapReduce has become a de facto standard computing model for big data processing. Since the original MapReduce model was only appropriate for embarrassingly parallel batch processing, many follow-up studies have focused on improving the efficiency and performance of the model. Spark follows one of these recent trends by providing in-memory processing capability to reduce slow disk I/O for iterative computing tasks. However, the acceleration of Spark's in-memory processing using graphics processing units (GPUs) is challenging due to its deep memory hierarchy and host-to-GPU communication overhead. In this paper, we introduce a novel GPU-accelerated MapReduce framework that extends Spark's in-memory processing so that iterative computing is performed only in the GPU memory. Having discovered that the main bottleneck in the current Spark system for GPU computing is data communication on a Java virtual machine, we propose a modification of the current Spark implementation to bypass expensive data management for iterative task offloading to GPUs. We also propose a novel GPU in-memory processing and caching framework that minimizes host-to-GPU communication via lazy evaluation and reuses GPU memory over multiple mapper executions. The proposed system employs message-passing interface (MPI)-based data synchronization for inter-worker communication so that more complicated iterative computing tasks, such as iterative numerical solvers, can be efficiently handled. We demonstrate the performance of our system in terms of several iterative computing tasks in big data processing applications, including machine learning and scientific computing. We achieved up to 50 times speed up over conventional Spark and about 10 times speed up over GPU-accelerated Spark.

Publisher: IEEE

Show Full Item Record

qrcode

RSS 1.0 RSS 2.0

UNIST | Library

Tel : 052-217-1404 / Email : scholarworks@unist.ac.kr

ScholarWorks@UNIST was established as an OAK Project for the National Library of Korea.