Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets

Author(s)
Satish, Nadathur; Sundaram, Narayanan; Patwary, Md. Mostofa Ali; Seo, Jiwon; Park, Jongsoo; Hassaan, M. Amber; Sengupta, Shubho; Yin, Zhaoming; Dubey, Pradeep
Issued Date
2014-06-25
DOI
10.1145/2588555.2610518
URI
https://scholarworks.unist.ac.kr/handle/201301/35588
Fulltext
http://dl.acm.org/citation.cfm?id=2588555.2610518
Citation
2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014, pp.979 - 990
Abstract
Graph algorithms are becoming increasingly important for analyzing large datasets in many fields. Real-world graph data follows a pattern of sparsity that is not uniform but highly skewed towards a few items. Implementing graph traversal, statistics and machine learning algorithms on such data in a scalable manner is quite challenging. As a result, several graph analytics frameworks (GraphLab, CombBLAS, Giraph, SociaLite and Galois among others) have been developed, each offering a solution with different programming models and targeted at different users. Unfortunately, the "Ninja performance gap" between optimized code and most of these frameworks is very large (2-30X for most frameworks and up to 560X for Giraph) for common graph algorithms, and moreover varies widely with algorithms. This makes the end-users' choice of graph framework dependent not only on ease of use but also on performance. In this work, we offer a quantitative roadmap for improving the performance of all these frameworks and bridging the "ninja gap". We first present hand-optimized baselines that get performance close to hardware limits and higher than any published performance figure for these graph algorithms. We characterize the performance of both this native implementation and popular graph frameworks on a variety of algorithms. This study helps end-users delineate bottlenecks arising from the algorithms themselves vs. programming model abstractions vs. the framework implementations. Further, by analyzing the system-level behavior of these frameworks, we obtain bottlenecks that are agnostic to specific algorithms. We recommend changes to alleviate these bottlenecks (and implement some of them) and reduce the performance gap with respect to native code. These changes will enable end-users to choose frameworks based mostly on ease of use.
Publisher
2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014
ISSN
0730-8078

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.