Performance Analysis of MapReduce with In-Memory Caching in HDFS

DC Field Value Language
dc.contributor.advisor Choi, Young-Ri -
dc.contributor.author Yoo, Tae-kyung -
dc.date.accessioned 2015-02-23T04:19:55Z -
dc.date.available 2015-02-23T04:19:55Z -
dc.date.issued 2015-02 -
dc.identifier.uri -
dc.description Department of Computer Engineering en_US
dc.description.abstract In this paper, we study the effects of HDFS in-memory caching on various MapReduce applications. We first evaluate the performance of seven MapReduce applications to understand their different resource usage patterns. We then modify the centralized cache management system in HDFS so that individual blocks of a file can be cached. Using the modified system, we compare the performance of MapReduce applications with and without in-memory caching, for workloads consisting of a single MapReduce application and of multiple MapReduce applications. In the experiments, the same workload is executed multiple times to observe the effects of in-memory caching. Our experimental results show that the in-memory cache system can benefit workloads of multiple I/O-intensive MapReduce applications, but it cannot improve the performance of non-I/O-intensive MapReduce applications and may even degrade it because of the overhead of in-memory caching. en_US
dc.description.statementofresponsibility open -
dc.language.iso en en_US
dc.publisher Graduate School of UNIST en_US
dc.subject MapReduce en_US
dc.subject Hadoop en_US
dc.subject Big data en_US
dc.subject In-memory caching en_US
dc.title Performance Analysis of MapReduce with In-Memory Caching in HDFS en_US
dc.type Master's thesis -