Related Researcher

이종은

Lee, Jongeun
Intelligent Computing and Codesign Lab.
Improving performance of loops on DIAM-based VLIW architectures

Author(s)
Lee, Jinyong; Lee, Jongwon; Paek, Yunheung; Lee, Jongeun
Issued Date
2014-05
DOI
10.1145/2597809.2597825
URI
https://scholarworks.unist.ac.kr/handle/201301/6401
Fulltext
http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84907030419
Citation
ACM SIGPLAN NOTICES, v.49, no.5, pp.135 - 144
Abstract
Recent studies show that very long instruction word (VLIW) architectures, which inherently have a wide datapath (e.g., 128 or 256 bits for one VLIW instruction word), can benefit from the dynamic implied addressing mode (DIAM), achieving lower power consumption and smaller code size with a small performance overhead. This overhead, claimed to be small, is mainly caused by executing additionally generated special instructions that convey information that cannot be encoded in the reduced instruction bit-width. In this paper, however, we show that the performance impact of applying DIAM to a VLIW architecture cannot be overlooked, especially when applications possess a high level of instruction-level parallelism (ILP), which is mostly the case for loops as a result of aggressive code scheduling. We also propose a way to relieve the performance degradation, focusing on loops, since loops account for almost 90% of total execution time in programs and tend to have high ILP. We first implement the original DIAM compilation technique in a compiler, and then augment it with the proposed loop optimization scheme to show that our approach clearly alleviates the performance loss caused by the excessive number of additional instructions, with the help of slightly modified hardware. Moreover, the well-known loop unrolling scheme, which produces denser code in loops at the cost of substantial code size bloat, is integrated into our compiler. The experimental results show that loop unrolling, combined with our augmented DIAM scheme, produces far better code in terms of performance with an acceptable increase in code size.
Publisher
ASSOC COMPUTING MACHINERY
ISSN
0362-1340
