Related Researcher

Lee, Jongeun
Renew: Reconfigurable and Neuromorphic Computing Lab
Research Interests
  • Reconfigurable processor architecture, neuromorphic processor, stochastic computing

Evaluator-executor transformation for efficient pipelining of loops with conditionals

DC Field Value Language
dc.contributor.author Jeong, Yeonghun ko
dc.contributor.author Seo, Seongseok ko
dc.contributor.author Lee, Jongeun ko
dc.date.available 2014-04-10T02:36:44Z -
dc.date.created 2014-02-07 ko
dc.date.issued 2013-12 -
dc.identifier.citation ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, v.10, no.4, pp.1 - 23 ko
dc.identifier.issn 1544-3566 ko
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/4114 -
dc.identifier.uri http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84892471505 ko
dc.description.abstract Control divergence poses many problems in parallelizing loops. While predicated execution is commonly used to convert control dependence into data dependence, it often incurs high overhead because it allocates resources equally for both branches of a conditional statement regardless of their execution frequencies. For those loops with unbalanced conditionals, we propose a software transformation that divides a loop into two or three smaller loops so that the condition is evaluated only in the first loop, while the less frequent branch is executed in the second loop in a way that is much more efficient than in the original loop. To reduce the overhead of extra data transfer caused by the loop fission, we also present a hardware extension for a class of Coarse-Grained Reconfigurable Architectures (CGRAs). Our experiments using MiBench and computer vision benchmarks on a CGRA demonstrate that our techniques can improve the performance of loops over predicated execution by up to 65% (37.5%, on average), when the hardware extension is enabled. Without any hardware modification, our software-only version can improve performance by up to 64% (33%, on average), while simultaneously reducing the energy consumption of the entire CGRA including configuration and data memory by 22%, on average. ko
dc.description.statementofresponsibility close -
dc.language ENG ko
dc.publisher ASSOC COMPUTING MACHINERY ko
dc.subject Coarse-Grained Reconfigurable Architecture (CGRA) ko
dc.subject Conditional statements ko
dc.subject Control divergence ko
dc.subject Loop fission ko
dc.subject Predicated execution ko
dc.subject Software pipelining ko
dc.title Evaluator-executor transformation for efficient pipelining of loops with conditionals ko
dc.type ARTICLE ko
dc.identifier.scopusid 2-s2.0-84892471505 ko
dc.identifier.wosid 000330509300043 ko
dc.type.rims ART ko
dc.description.wostc 0 *
dc.description.scopustc 0 *
dc.date.tcdate 2014-10-18 *
dc.date.scptcdate 2014-07-12 *
dc.identifier.doi 10.1145/2555289.2555317 ko
Appears in Collections:
EE_Journal Papers
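
The loop fission summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not model a CGRA. It only shows the general shape of splitting a loop with an unbalanced conditional into an evaluator loop and an executor loop. The function names (`rare_path`, `common_path`, `predicated_loop`, `evaluator_executor`), the threshold test, and the choice to run the frequent branch inside the evaluator loop are all assumptions made for illustration. The sketch also assumes the iterations carry no dependence on one another.

```python
def rare_path(x):
    # Hypothetical stand-in for the infrequent, costly branch.
    return x * x - 3


def common_path(x):
    # Hypothetical stand-in for the frequent, cheap branch.
    return x + 1


def predicated_loop(data, threshold):
    # Baseline in the spirit of predicated execution: both branches are
    # computed on every iteration and the predicate selects one result.
    out = []
    for x in data:
        rare = rare_path(x)
        common = common_path(x)
        out.append(rare if x > threshold else common)
    return out


def evaluator_executor(data, threshold):
    out = [0] * len(data)
    pending = []  # indices whose infrequent branch is deferred

    # Evaluator loop: the only place the condition is tested.
    # Assumption: the frequent branch is handled here as well.
    for i, x in enumerate(data):
        if x > threshold:
            pending.append(i)
        else:
            out[i] = common_path(x)

    # Executor loop: runs the infrequent branch only where it is needed.
    for i in pending:
        out[i] = rare_path(data[i])
    return out


if __name__ == "__main__":
    sample = [1, 2, 50, 3, 4, 60, 5]
    print(evaluator_executor(sample, 10))
```

In this sketch the `pending` list plays the role of the extra data passed between the split loops, which is the kind of transfer overhead the abstract says the proposed hardware extension is meant to reduce.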
