Locality-Aware Fair Scheduling in the Distributed Query Processing Framework

DC Field Value Language
dc.contributor.advisor Nam, Beomseok -
dc.contributor.author Eom, Youngmoon -
dc.date.accessioned 2015-02-09T02:57:34Z -
dc.date.available 2015-02-09T02:57:34Z -
dc.date.issued 2015-02 -
dc.identifier.uri -
dc.identifier.uri -
dc.description Department of Computer Engineering en_US
dc.description.abstract Exploiting caching facilities in modern query processing systems is becoming more important as main memory capacity continues to grow. In data-intensive applications especially, caching yields a significant performance gain by avoiding disk I/O, which is far more expensive than memory access. Data must therefore be distributed carefully across back-end application servers to benefit from caching as much as possible. Load balance across back-end application servers is another concern the scheduler must consider: serious load imbalance may result in poor performance even if the cache hit ratio is high. Scheduling is made harder still by the fact that a decision that raises the cache hit ratio sometimes causes load imbalance. We therefore need a scheduling algorithm that successfully balances the trade-off between load balance and cache hit ratio. To account for both, we propose two semantic caching mechanisms, DEMB and EM-KDE, which balance the load while maintaining a high cache hit ratio by analyzing and predicting trends in query arrival patterns. Another concern discussed in this paper is an environment with multiple front-end schedulers, each of which can see a different query arrival pattern from its users. To reflect these differences, we compare and evaluate three algorithms that aggregate the query arrival pattern information from each front-end scheduler. To further increase the cache hit ratio under semantic caching scheduling, we propose migrating cache contents to a nearby server. The cache hit count can be increased if data is dynamically migrated to the server to which subsequent requests for that data are expected to be forwarded. Several migration policies and their pros and cons are discussed.
Finally, we introduce a MapReduce framework called Eclipse, which takes full advantage of the semantic caching scheduling algorithms described above. We show that Eclipse outperforms other MapReduce frameworks in most evaluations. en_US
dc.description.statementofresponsibility open -
dc.language.iso en en_US
dc.publisher Graduate school of UNIST en_US
dc.subject MapReduce en_US
dc.subject Distributed System en_US
dc.subject Query Processing en_US
dc.title Locality-Aware Fair Scheduling in the Distributed Query Processing Framework en_US
dc.type Master's thesis -
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.