TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services

Author(s)
Jeon, Myeongjae; He, Yuxiong; Kim, Hwangju; Elnikety, Sameh; Rixner, Scott; Cox, Alan L.
Issued Date
2016-04-02
DOI
10.1145/2872362.2872370
URI
https://scholarworks.unist.ac.kr/handle/201301/35431
Fulltext
https://dl.acm.org/citation.cfm?doid=2872362.2872370
Citation
21st International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016, pp. 129-141
Abstract
In interactive services such as web search, recommendations, games and finance, reducing the tail latency is crucial to provide fast response to every user. Using web search as a driving example, we systematically characterize interactive workload to identify the opportunities and challenges for reducing tail latency. We find that the workload consists of mainly short requests that do not benefit from parallelism, and a few long requests which significantly impact the tail but exhibit high parallelism speedup. This motivates estimating request execution time, using a predictor, to identify long requests and to parallelize them. Prediction, however, is not perfect; a long request mispredicted as short is likely to contribute to the server tail latency, setting a ceiling on the achievable tail latency. We propose TPC, an approach that combines prediction information judiciously with dynamic correction for inaccurate prediction. Dynamic correction increases parallelism to accelerate a long request that is mispredicted as short. TPC carefully selects the appropriate target latencies based on system load and parallelism efficiency to reduce tail latency.

We implement TPC and several prior approaches and compare them experimentally on a single search server and on a cluster of 40 search servers. The experimental results show that TPC reduces the 99th- and 99.9th-percentile latency by up to 40% compared with the best prior work. Moreover, we evaluate TPC on a finance server, demonstrating its effectiveness in reducing the tail latency of interactive services beyond web search.
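
The prediction-plus-correction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Policy`, `initial_degree`, and `corrected_degree`, and every threshold value, are hypothetical. The sketch also uses fixed thresholds, whereas the abstract describes TPC as choosing its target latencies from system load and parallelism efficiency.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # All values are hypothetical tuning knobs, not numbers from the paper.
    long_threshold_ms: float = 80.0     # predicted time above this counts as "long"
    correction_after_ms: float = 100.0  # elapsed time that triggers correction
    max_degree: int = 4                 # upper bound on worker threads per request


def initial_degree(predicted_ms: float, policy: Policy) -> int:
    """Prediction step: parallelize only requests predicted to be long."""
    return policy.max_degree if predicted_ms > policy.long_threshold_ms else 1


def corrected_degree(current_degree: int, elapsed_ms: float, policy: Policy) -> int:
    """Correction step: raise parallelism once a request has run too long."""
    if elapsed_ms > policy.correction_after_ms:
        return policy.max_degree
    return current_degree


policy = Policy()
# A long request mispredicted as short starts out sequential...
start = initial_degree(predicted_ms=20.0, policy=policy)
# ...and is accelerated once its elapsed time exceeds the correction threshold.
later = corrected_degree(start, elapsed_ms=150.0, policy=policy)
```

In this sketch a request predicted to take 20 ms starts with one thread, and the correction step raises it to the maximum degree once it has run for 150 ms, which mirrors the case of a long request mispredicted as short.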
Publisher
21st International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2016

