Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets

Satish, Nadathur; Sundaram, Narayanan; Patwary, Md. Mostofa Ali; Seo, Jiwon; Park, Jongsoo; Hassaan, M. Amber; Sengupta, Shubho; Yin, Zhaoming; Dubey, Pradeep

doi:10.1145/2588555.2610518

Author(s): Satish, Nadathur; Sundaram, Narayanan; Patwary, Md. Mostofa Ali; Seo, Jiwon; Park, Jongsoo; Hassaan, M. Amber; Sengupta, Shubho; Yin, Zhaoming; Dubey, Pradeep

Issued Date: 2014-06-25

DOI: 10.1145/2588555.2610518

URI: https://scholarworks.unist.ac.kr/handle/201301/35588

Fulltext: http://dl.acm.org/citation.cfm?id=2588555.2610518

Citation: 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014, pp. 979-990

Abstract: Graph algorithms are becoming increasingly important for analyzing large datasets in many fields. Real-world graph data follows a pattern of sparsity that is not uniform but highly skewed towards a few items. Implementing graph traversal, statistics and machine learning algorithms on such data in a scalable manner is quite challenging. As a result, several graph analytics frameworks (GraphLab, CombBLAS, Giraph, SociaLite and Galois among others) have been developed, each offering a solution with different programming models and targeted at different users. Unfortunately, the "Ninja performance gap" between optimized code and most of these frameworks is very large (2-30X for most frameworks and up to 560X for Giraph) for common graph algorithms, and moreover varies widely with algorithms. This makes the end-users' choice of graph framework dependent not only on ease of use but also on performance. In this work, we offer a quantitative roadmap for improving the performance of all these frameworks and bridging the "ninja gap". We first present hand-optimized baselines that get performance close to hardware limits and higher than any published performance figure for these graph algorithms. We characterize the performance of both this native implementation as well as popular graph frameworks on a variety of algorithms. This study helps end-users delineate bottlenecks arising from the algorithms themselves vs. programming model abstractions vs. the framework implementations. Further, by analyzing the system-level behavior of these frameworks, we obtain bottlenecks that are agnostic to specific algorithms. We recommend changes to alleviate these bottlenecks (and implement some of them) and reduce the performance gap with respect to native code. These changes will enable end-users to choose frameworks based mostly on ease of use.

Publisher: 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014

ISSN: 0730-8078
