Locality-Aware Fair Scheduling in the Distributed Query Processing Framework

Eom, Youngmoon

Scholarworks@UNIST

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Nam, Beomseok	-
dc.contributor.author	Eom, Youngmoon	-
dc.date.accessioned	2024-01-24T15:26:42Z	-
dc.date.available	2024-01-24T15:26:42Z	-
dc.date.issued	2015-02	-
dc.description.abstract	Exploiting caches in modern query processing systems is becoming more important as main memory capacity grows. In data-intensive applications in particular, caching yields significant performance gains by avoiding disk I/O, which is far more expensive than memory access. Data must therefore be distributed carefully across back-end application servers to benefit from caching as much as possible. Load balance across back-end application servers is another concern the scheduler must consider: serious load imbalance may result in poor performance even when the cache hit ratio is high. Scheduling is made harder by the fact that a decision that raises the cache hit ratio sometimes causes load imbalance. We therefore need a scheduling algorithm that balances the trade-off between load balance and cache hit ratio. To account for both, we propose two semantic caching mechanisms, DEMB and EM-KDE, which balance the load while keeping a high cache hit ratio by analyzing and predicting trends in query arrival patterns. Another concern discussed in this paper is an environment with multiple front-end schedulers, each of which can see a different query arrival pattern from its users. To reflect those differences, we compare and evaluate three algorithms that aggregate the query arrival pattern information from the front-end schedulers. To further increase the cache hit ratio under semantic caching scheduling, we propose migrating cache contents to nearby servers. The cache hit count can increase if data is dynamically migrated to the server to which subsequent data requests are expected to be forwarded. Several migration policies and their pros and cons are discussed.
Finally, we introduce a MapReduce framework called Eclipse, which takes full advantage of the semantic caching scheduling algorithms described above. We show that Eclipse outperforms other MapReduce frameworks in most evaluations.	-
dc.description.degree	Master	-
dc.description	Department of Computer Engineering	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/71884	-
dc.identifier.uri	http://unist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000001925183	-
dc.language	eng	-
dc.publisher	Ulsan National Institute of Science and Technology (UNIST)	-
dc.rights.embargoReleaseDate	9999-12-31	-
dc.rights.embargoReleaseTerms	9999-12-31	-
dc.subject	MapReduce, Distributed System, Query Processing	-
dc.title	Locality-Aware Fair Scheduling in the Distributed Query Processing Framework	-
dc.type	Thesis	-
