Full metadata record
DC Field | Value | Language |
---|---|---|
dc.citation.endPage | 897 | - |
dc.citation.number | 5 | - |
dc.citation.startPage | 888 | - |
dc.citation.title | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | - |
dc.citation.volume | 38 | - |
dc.contributor.author | Lee, Sugil | - |
dc.contributor.author | Kim, Daewoo | - |
dc.contributor.author | Nguyen, Dong | - |
dc.contributor.author | Lee, Jongeun | - |
dc.date.accessioned | 2023-12-21T19:12:07Z | - |
dc.date.available | 2023-12-21T19:12:07Z | - |
dc.date.created | 2018-06-12 | - |
dc.date.issued | 2019-05 | - |
dc.description.abstract | Deep learning workloads such as convolutional neural networks (CNNs) are important and increasingly demand high-performance hardware acceleration. One distinguishing feature of deep learning workloads is that they are inherently resilient to small numerical errors and work very well with low-precision hardware. Thus we propose a novel method, called Double MAC, to theoretically double the computation rate of CNN accelerators by packing two multiply-and-accumulate (MAC) operations into one DSP block of off-the-shelf FPGAs. There are several technical challenges, which we overcome by exploiting the mode of operation in the CNN accelerator. We have validated our method through FPGA synthesis and Verilog simulation, and evaluated it by applying it to a state-of-the-art CNN accelerator. We find that our Double MAC approach can double the computation throughput of a CNN layer. At the network level (all convolution layers combined), the performance improvement varies with the CNN application and FPGA size, from 14% to more than 80% over a highly optimized state-of-the-art accelerator solution, without significantly sacrificing output quality. | - |
dc.identifier.bibliographicCitation | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, v.38, no.5, pp.888 - 897 | - |
dc.identifier.doi | 10.1109/TCAD.2018.2824280 | - |
dc.identifier.issn | 0278-0070 | - |
dc.identifier.scopusid | 2-s2.0-85045193712 | - |
dc.identifier.uri | https://scholarworks.unist.ac.kr/handle/201301/24220 | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/8332524/ | - |
dc.identifier.wosid | 000466037700009 | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Double MAC on a DSP: Boosting the Performance of Convolutional Neural Networks on FPGAs | - |
dc.type | Article | - |
dc.description.isOpenAccess | FALSE | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic | - |
dc.relation.journalResearchArea | Computer Science; Engineering | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordAuthor | Accelerator architectures | - |
dc.subject.keywordAuthor | Convolution | - |
dc.subject.keywordAuthor | Convolutional neural network | - |
dc.subject.keywordAuthor | DSP (Digital Signal Processing) block | - |
dc.subject.keywordAuthor | Field programmable gate arrays | - |
dc.subject.keywordAuthor | FPGA | - |
dc.subject.keywordAuthor | Hardware | - |
dc.subject.keywordAuthor | MAC (Multiply-and-Accumulate) | - |
dc.subject.keywordAuthor | reduced precision | - |
dc.subject.keywordAuthor | SIMD (Single-Instruction Multiple-Data) | - |
dc.subject.keywordAuthor | Table lookup | - |
dc.subject.keywordAuthor | Throughput | - |
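
The abstract describes packing two multiply-and-accumulate operations into one DSP block but does not spell out the packing scheme. The following is a minimal sketch of one common way such packing can work in principle, assuming two signed 8-bit weights share a single signed 8-bit input and are placed 18 bits apart in one wide operand. The names `SHIFT`, `pack_weights`, and `double_multiply` are hypothetical, and this is not claimed to be the authors' actual design.

```python
# Assumption: an 18-bit offset between the two packed weights. A signed
# 8-bit x 8-bit product needs at most 16 bits, so the low product fits
# in the low 18-bit field with guard bits to spare.
SHIFT = 18


def pack_weights(w_hi: int, w_lo: int) -> int:
    """Place two signed 8-bit weights into one wide multiplier operand."""
    return (w_hi << SHIFT) + w_lo


def double_multiply(w_hi: int, w_lo: int, x: int) -> tuple[int, int]:
    """Return (w_hi * x, w_lo * x) using a single wide multiplication."""
    # One multiply yields (w_hi * x) * 2**SHIFT + (w_lo * x).
    product = pack_weights(w_hi, w_lo) * x

    # Recover the low product as a signed SHIFT-bit value.
    lo = product & ((1 << SHIFT) - 1)
    if lo >= 1 << (SHIFT - 1):
        lo -= 1 << SHIFT

    # Removing the low product first undoes the borrow that a negative
    # low product would otherwise leave in the high field.
    hi = (product - lo) >> SHIFT
    return hi, lo


if __name__ == "__main__":
    print(double_multiply(3, -5, 7))
```

In this sketch the sign correction is the part that needs care: when the low product is negative it borrows from the high field, so the high product cannot be read off by a plain shift. Accumulating many such products would need extra guard bits or periodic extraction, which the sketch does not model.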