File Download

  • Find it @ UNIST can give you direct access to the published full text of this article. (UNISTARs only)

Views & Downloads

Detailed Information

Cited time in webofscience Cited time in scopus
Metadata Downloads

Full metadata record

DC Field Value Language
dc.contributor.advisor Nam, Beomseok -
dc.contributor.author Eom, Youngmoon -
dc.date.accessioned 2024-01-24T15:26:42Z -
dc.date.available 2024-01-24T15:26:42Z -
dc.date.issued 2015-02 -
dc.description.abstract Utilizing caching facilities in modern query processing systems is getting more important as the capacity of main memory is having been greatly increasing. Especially in the data intensive applications, caching effect gives significant performance gain avoiding disk I/O which is highly expensive than memory access. Therefore data must be carefully distributed across back-end application servers to get advantages from caching as much as possible.
On the other hand, load balance across back-end application servers is another concern the scheduler must consider. Serious load imbalance may result in poor performance even if the cache hit ratio is high. And the fact that scheduling decision which raises cache hit ratio sometimes results in load imbalance even makes it harder to make scheduling decision. Therefore we should find a scheduling algorithm which balances trade-off between load balance and cache hit ratio successfully.
To consider both cache hit and load balance, we propose two semantic caching mechanisms DEMB and EM-KDE which successfully balance the load while keeping high cache hit ratio by analyzing and predicting trend of query arrival patterns.
Another concern discussed in this paper is the environment with multiple front-end schedulers. Each scheduler can have different query arrival pattern from users. To reflect those differences of query arrival pattern from each front-end scheduler, we compare 3 algorithms which aggregate the query arrival pattern information from each front-end scheduler and evaluate them.
To increase cache hit ratio in semantic caching scheduling further, migrating contents of cache to nearby server is proposed. We can increase cache hit count if data can be dynamically migrated to the server where the subsequent data requests supposed to be forwarded. Several migrating policies and their pros and cons will be discussed later.
Finally, we introduce a MapReduce framework called Eclipse which takes full advantages from semantic caching scheduling algorithm mentioned above. We show that Eclipse outperforms other MapReduce frameworks in most evaluations.
-
dc.description.degree Master -
dc.description Department of Computer Engineering -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/71884 -
dc.identifier.uri http://unist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000001925183 -
dc.language eng -
dc.publisher Ulsan National Institute of Science and Technology (UNIST) -
dc.rights.embargoReleaseDate 9999-12-31 -
dc.rights.embargoReleaseTerms 9999-12-31 -
dc.subject MapReduce, Distributed System, Query Processing -
dc.title Locality-Aware Fair Scheduling in the Distributed Query Processing Framework -
dc.type Thesis -

qrcode

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.