Coalescing HDFS Blocks to Avoid Recurring YARN Container Overhead

Author(s)
Kim, Wonbae; Choi, Young-Ri; Nam, Beomseok
Issued Date
2017-06-25
DOI
10.1109/CLOUD.2017.35
URI
https://scholarworks.unist.ac.kr/handle/201301/32754
Fulltext
http://ieeexplore.ieee.org/document/8030591/
Citation
IEEE International Conference on Cloud Computing, pp. 214-221
Abstract
Hadoop clusters have been transitioning from a dedicated cluster environment to a shared cluster environment. This trend has resulted in the YARN container abstraction that isolates computing tasks from physical resources. With YARN containers, Hadoop has expanded to support various distributed frameworks. However, it has been reported that Hadoop tasks suffer from a significant overhead of container relaunch. In order to reduce the container overhead without making significant changes to the existing YARN framework, we propose leveraging the input split, which is the logical representation of physical HDFS blocks. Our assorted block coalescing scheme combines multiple HDFS blocks and creates large input splits of various sizes, reducing the number of containers and their initialization overhead. Our experimental study shows the assorted block coalescing scheme reduces the container overhead by a large margin while it achieves good load balance and job scheduling fairness without impairing the degree of overlap between map phase and reduce phase.
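The coalescing idea in the abstract can be illustrated with a small sketch. The abstract does not specify how split sizes are chosen, so the policy below is an assumption made purely for illustration, not the authors' algorithm: splits start large (up to a hypothetical `max_factor` blocks each) and shrink as the job nears completion, so that late splits can even out the load across slots. The names `coalesce_blocks`, `num_slots`, and `max_factor` are hypothetical.

```python
from typing import List


def coalesce_blocks(num_blocks: int, num_slots: int, max_factor: int = 8) -> List[List[int]]:
    """Group block indices 0..num_blocks-1 into input splits of varying size.

    Illustrative policy (an assumption, not taken from the paper): in each
    round every slot receives one split whose size is the number of remaining
    blocks divided by 2 * num_slots, capped at max_factor and never below 1.
    Early splits are therefore large and later ones small.
    """
    if num_blocks < 0 or num_slots < 1 or max_factor < 1:
        raise ValueError("num_blocks must be >= 0; num_slots and max_factor must be >= 1")

    splits: List[List[int]] = []
    next_block = 0
    while next_block < num_blocks:
        remaining = num_blocks - next_block
        size = max(1, min(max_factor, remaining // (2 * num_slots)))
        for _ in range(num_slots):
            if next_block >= num_blocks:
                break
            end = min(next_block + size, num_blocks)
            # One split corresponds to one container in this simplified model.
            splits.append(list(range(next_block, end)))
            next_block = end
    return splits


if __name__ == "__main__":
    splits = coalesce_blocks(num_blocks=64, num_slots=4)
    print("containers without coalescing:", 64)
    print("containers with coalescing:   ", len(splits))
    print("split sizes:", [len(s) for s in splits])
```

In this simplified model each split maps to one container, so fewer splits means fewer container launches, while the small trailing splits leave room to balance work across slots at the end of the map phase.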
Publisher
IEEE
ISSN
2159-6182

