File Download

There are no files associated with this item.

Related Researcher

Jeon, Myeongjae (전명재)


Full metadata record

DC Field Value Language
dc.citation.conferencePlace US -
dc.citation.conferencePlace Georgia Tech, Atlanta -
dc.citation.endPage 141 -
dc.citation.startPage 129 -
dc.citation.title 21st International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016 -
dc.contributor.author Jeon, Myeongjae -
dc.contributor.author He, Yuxiong -
dc.contributor.author Kim, Hwangju -
dc.contributor.author Elnikety, Sameh -
dc.contributor.author Rixner, Scott -
dc.contributor.author Cox, Alan L. -
dc.date.accessioned 2023-12-19T21:07:30Z -
dc.date.available 2023-12-19T21:07:30Z -
dc.date.created 2018-08-27 -
dc.date.issued 2016-04-02 -
dc.description.abstract In interactive services such as web search, recommendations, games and finance, reducing the tail latency is crucial to provide fast response to every user. Using web search as a driving example, we systematically characterize interactive workload to identify the opportunities and challenges for reducing tail latency. We find that the workload consists of mainly short requests that do not benefit from parallelism, and a few long requests which significantly impact the tail but exhibit high parallelism speedup. This motivates estimating request execution time, using a predictor, to identify long requests and to parallelize them. Prediction, however, is not perfect; a long request mispredicted as short is likely to contribute to the server tail latency, setting a ceiling on the achievable tail latency. We propose TPC, an approach that combines prediction information judiciously with dynamic correction for inaccurate prediction. Dynamic correction increases parallelism to accelerate a long request that is mispredicted as short. TPC carefully selects the appropriate target latencies based on system load and parallelism efficiency to reduce tail latency.

We implement TPC and several prior approaches to compare them experimentally on a single search server and on a cluster of 40 search servers. The experimental results show that TPC reduces the 99th- and 99.9th-percentile latency by up to 40% compared with the best prior work. Moreover, we evaluate TPC on a finance server, demonstrating its effectiveness on reducing tail latency of interactive services beyond web search. -
dc.identifier.bibliographicCitation 21st International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016, pp.129 - 141 -
dc.identifier.doi 10.1145/2872362.2872370 -
dc.identifier.scopusid 2-s2.0-84975230468 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/35431 -
dc.identifier.url https://dl.acm.org/citation.cfm?doid=2872362.2872370 -
dc.language English -
dc.publisher 21st International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016 -
dc.title TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services -
dc.type Conference Paper -
dc.date.conferenceDate 2016-04-02 -
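The abstract above describes a predict-then-correct scheduling policy: requests predicted to be long are parallelized up front, while a predicted-short request that overruns a target latency is dynamically "corrected" by adding parallelism. The following is a minimal illustrative sketch of that idea only, not the authors' implementation; all function names, the efficiency factor, and the numeric values are hypothetical.

```python
def finish_time(work, degree, efficiency=0.9):
    """Completion time of `work` units on `degree` workers.

    efficiency < 1 models sub-linear parallelism speedup
    (a toy stand-in for the parallelism-efficiency tradeoff
    the abstract mentions).
    """
    if degree == 1:
        return work
    return work / (degree * efficiency)

def schedule(work, predicted_long, target, max_degree=4):
    """Simulated latency of one request under predict-then-correct.

    predicted_long -- the predictor's verdict (it may be wrong)
    target         -- latency budget; exceeding it triggers correction
    """
    if predicted_long:
        # Predicted long: parallelize immediately.
        return finish_time(work, max_degree)
    if work <= target:
        # Predicted short and actually short: run sequentially.
        return work
    # Misprediction (long request predicted short): run sequentially
    # until the target is reached, then correct by parallelizing the
    # remaining work.
    remaining = work - target
    return target + finish_time(remaining, max_degree)

# A mispredicted long request is corrected mid-flight, so it still
# finishes far sooner than fully sequential execution would allow:
lat_corrected = schedule(work=100.0, predicted_long=False, target=10.0)
assert lat_corrected < 100.0
```

In this toy model, dynamic correction bounds the damage of a misprediction: the sequential portion is capped at the target latency, after which the request gets the same parallelism a correctly predicted long request would have received, which is the mechanism the abstract credits for lifting the ceiling that imperfect prediction alone would impose on tail latency.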


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.