Evaluator-executor transformation for efficient pipelining of loops with conditionals

Jeong, Yeonghun; Seo, Seongseok; Lee, Jongeun

doi:10.1145/2555289.2555317

Related Researcher

Lee, Jongeun (이종은): Intelligent Computing and Codesign Lab.

Full metadata record

DC Field	Value	Language
dc.citation.endPage	23	-
dc.citation.number	4	-
dc.citation.startPage	1	-
dc.citation.title	ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION	-
dc.citation.volume	10	-
dc.contributor.author	Jeong, Yeonghun	-
dc.contributor.author	Seo, Seongseok	-
dc.contributor.author	Lee, Jongeun	-
dc.date.accessioned	2023-12-22T03:11:00Z	-
dc.date.available	2023-12-22T03:11:00Z	-
dc.date.created	2014-02-07	-
dc.date.issued	2013-12	-
dc.description.abstract	Control divergence poses many problems in parallelizing loops. While predicated execution is commonly used to convert control dependence into data dependence, it often incurs high overhead because it allocates resources equally for both branches of a conditional statement regardless of their execution frequencies. For those loops with unbalanced conditionals, we propose a software transformation that divides a loop into two or three smaller loops so that the condition is evaluated only in the first loop, while the less frequent branch is executed in the second loop in a way that is much more efficient than in the original loop. To reduce the overhead of extra data transfer caused by the loop fission, we also present a hardware extension for a class of Coarse-Grained Reconfigurable Architectures (CGRAs). Our experiments using MiBench and computer vision benchmarks on a CGRA demonstrate that our techniques can improve the performance of loops over predicated execution by up to 65% (37.5%, on average), when the hardware extension is enabled. Without any hardware modification, our software-only version can improve performance by up to 64% (33%, on average), while simultaneously reducing the energy consumption of the entire CGRA including configuration and data memory by 22%, on average.	-
dc.identifier.bibliographicCitation	ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, v.10, no.4, pp.1 - 23	-
dc.identifier.doi	10.1145/2555289.2555317	-
dc.identifier.issn	1544-3566	-
dc.identifier.scopusid	2-s2.0-84892471505	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/4114	-
dc.identifier.url	http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84892471505	-
dc.identifier.wosid	000330509300043	-
dc.language	English	-
dc.publisher	ASSOC COMPUTING MACHINERY	-
dc.title	Evaluator-executor transformation for efficient pipelining of loops with conditionals	-
dc.type	Article	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture; Computer Science, Theory & Methods	-
dc.relation.journalResearchArea	Computer Science	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
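
The loop fission described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or CGRA mapping, only an illustration of the general idea under stated assumptions: iterations are independent (no loop-carried dependence), the frequent branch is kept in the first loop, and the names `original_loop`, `evaluator_executor`, `cond`, `frequent`, and `rare` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def original_loop(
    data: Sequence[int],
    cond: Callable[[int], bool],
    frequent: Callable[[int], int],
    rare: Callable[[int], int],
) -> List[int]:
    """Single loop with a conditional; both branches live in one loop body."""
    out = []
    for x in data:
        out.append(rare(x) if cond(x) else frequent(x))
    return out


def evaluator_executor(
    data: Sequence[int],
    cond: Callable[[int], bool],
    frequent: Callable[[int], int],
    rare: Callable[[int], int],
) -> List[int]:
    """Two-loop version: the condition is evaluated only in the first loop."""
    out: List[Optional[int]] = [None] * len(data)
    pending: List[int] = []  # indices whose rare branch is deferred

    # Evaluator loop: evaluate the condition once per iteration.
    # Assumption: the frequent branch is also executed here.
    for i, x in enumerate(data):
        if cond(x):
            pending.append(i)
        else:
            out[i] = frequent(x)

    # Executor loop: run the rare branch only for the recorded iterations,
    # so no condition check is needed in this loop body.
    for i in pending:
        out[i] = rare(data[i])

    return out  # every slot is filled by one of the two loops


if __name__ == "__main__":
    samples = [3, -1, 4, -5, 9]
    print(evaluator_executor(samples, lambda x: x < 0,
                             lambda x: x * 2, lambda x: x * x + 1))
```

The `pending` list stands in for the extra data transfer between the split loops that the abstract mentions; in this sketch it is plain memory traffic, whereas the paper proposes a hardware extension to reduce that overhead.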
