Evaluator-Executor Transformation for Efficient Conditional Statements on CGRA
Cited 0 times inCited 0 times in
- Evaluator-Executor Transformation for Efficient Conditional Statements on CGRA
- Jeong, Yeonghun
- Lee, Jongeun
- Issue Date
- Graduate School of UNIST
- Control divergence poses many problems in parallelizing loops. While predicated execution is commonly used to convert control dependence into data dependence, it often incurs high overhead because it allocates resources equally for both branches of a conditional statement regardless of their execution frequencies. For those loops with unbalanced conditionals, we propose a software transformation that divides a loop into two or three smaller loops so that the condition is evaluated only in the first loop while the less frequent branch is executed in the second loop in a way that is much more efficient than in the original loop. To reduce the overhead of extra data transfer caused by the loop fission, we also present a hardware extension for a class of coarse-grained reconfigurable architectures (CGRAs). Our experiments using MiBench and computer vision benchmarks on a CGRA demonstrate that our techniques can improve the performance of loops over predicated execution by up to 65%, or 38.0% on average when the hardware extension is enabled. Without any hardware modification, our software-only version can improve performance by up to 64%, or 33.2% on average, while simultaneously reducing the energy consumption of the entire CGRA including configuration and data memory by 22.0% on average.
- Computer Engineering
- ; Go to Link
Appears in Collections:
- Files in This Item:
can give you direct access to the published full text of this article. (UNISTARs only)
Show full item record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.