Double MAC on a DSP: Boosting the Performance of Convolutional Neural Networks on FPGAs

Lee, Sugil; Kim, Daewoo; Nguyen, Dong; Lee, Jongeun

doi:10.1109/TCAD.2018.2824280

Scholarworks@UNIST


Related Researcher

이종은

Lee, Jongeun: Intelligent Computing and Codesign Lab.


Full metadata record

DC Field	Value	Language
dc.citation.endPage	897	-
dc.citation.number	5	-
dc.citation.startPage	888	-
dc.citation.title	IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS	-
dc.citation.volume	38	-
dc.contributor.author	Lee, Sugil	-
dc.contributor.author	Kim, Daewoo	-
dc.contributor.author	Nguyen, Dong	-
dc.contributor.author	Lee, Jongeun	-
dc.date.accessioned	2023-12-21T19:12:07Z	-
dc.date.available	2023-12-21T19:12:07Z	-
dc.date.created	2018-06-12	-
dc.date.issued	2019-05	-
dc.description.abstract	Deep learning workloads such as Convolutional Neural Networks (CNNs) are important and increasingly demand high-performance hardware acceleration. One distinguishing feature of deep learning workloads is that they are inherently resilient to small numerical errors and work very well with low-precision hardware. Thus we propose a novel method, called Double MAC, to theoretically double the computation rate of CNN accelerators by packing two multiply-and-accumulate (MAC) operations into one DSP block of off-the-shelf FPGAs. There are several technical challenges, which we overcome by exploiting the mode of operation in the CNN accelerator. We have validated our method through FPGA synthesis and Verilog simulation, and evaluated it by applying it to a state-of-the-art CNN accelerator. We find that our Double MAC approach can double the computation throughput of a CNN layer. At the network level (all convolution layers combined), the performance improvement varies with the CNN application and FPGA size, from 14% to more than 80% over a highly optimized state-of-the-art accelerator solution, without significantly sacrificing output quality.	-
dc.identifier.bibliographicCitation	IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, v.38, no.5, pp.888 - 897	-
dc.identifier.doi	10.1109/TCAD.2018.2824280	-
dc.identifier.issn	0278-0070	-
dc.identifier.scopusid	2-s2.0-85045193712	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/24220	-
dc.identifier.url	https://ieeexplore.ieee.org/document/8332524/	-
dc.identifier.wosid	000466037700009	-
dc.language	English	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Double MAC on a DSP: Boosting the Performance of Convolutional Neural Networks on FPGAs	-
dc.type	Article	-
dc.description.isOpenAccess	FALSE	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture; Computer Science, Interdisciplinary Applications; Engineering, Electrical & Electronic	-
dc.relation.journalResearchArea	Computer Science; Engineering	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordAuthor	Accelerator architectures	-
dc.subject.keywordAuthor	Convolution	-
dc.subject.keywordAuthor	Convolutional neural network	-
dc.subject.keywordAuthor	DSP (Digital Signal Processing) block	-
dc.subject.keywordAuthor	Field programmable gate arrays	-
dc.subject.keywordAuthor	FPGA	-
dc.subject.keywordAuthor	Hardware	-
dc.subject.keywordAuthor	MAC (Multiply-and-Accumulate)	-
dc.subject.keywordAuthor	reduced precision	-
dc.subject.keywordAuthor	SIMD (Single-Instruction Multiple-Data)	-
dc.subject.keywordAuthor	Table lookup	-
dc.subject.keywordAuthor	Throughput	-
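
The abstract describes packing two multiply-and-accumulate operations into one DSP block. The record does not give the circuit details, so the following is only a minimal software sketch of the general operand-packing idea, assuming two signed 8-bit values that share a common multiplicand. The function name `double_mul`, the 16-bit `shift`, and the sign-correction step are illustrative assumptions and are not taken from the paper.

```python
def double_mul(a: int, b: int, c: int, shift: int = 16) -> tuple[int, int]:
    """Compute a*c and b*c with a single wide multiplication.

    Assumption (hypothetical, for illustration): a, b, c are signed 8-bit
    values, so b*c fits in a signed `shift`-bit field.
    """
    # Place a in the upper field and b in the lower field of one operand.
    packed = (a << shift) + b

    # One multiplication yields (a*c << shift) + b*c.
    prod = packed * c

    # Recover the lower product and reinterpret it as a signed value.
    low = prod & ((1 << shift) - 1)
    if low >= 1 << (shift - 1):
        low -= 1 << shift

    # Subtracting the signed lower product undoes its borrow from the
    # upper field, leaving exactly a*c after the shift.
    high = (prod - low) >> shift
    return high, low


if __name__ == "__main__":
    ac, bc = double_mul(3, -5, 7)
    print(ac, bc)
```

The subtraction before the final shift matters: when the lower product is negative it borrows from the upper field, so the upper product would otherwise be off by one.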

