Efficient FPGA Acceleration of Convolutional Deep Neural Networks

ATUL RAHMAN

Scholarworks@UNIST

UNIST Library


Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Lee, Jongeun	-
dc.contributor.author	ATUL RAHMAN	-
dc.date.accessioned	2024-01-25T13:31:35Z	-
dc.date.available	2024-01-25T13:31:35Z	-
dc.date.issued	2016-08	-
dc.description.abstract	Deep Convolutional Neural Networks (CNNs) are a powerful model for visual recognition tasks, but because of their very high computational requirements, acceleration is highly desirable. FPGA accelerators for CNNs are typically built around one large MAC (multiply-accumulate) array, which is reused to compute every convolutional layer, even though those layers can be quite diverse and complex. A key challenge is therefore how to design a common architecture that performs well for all convolutional layers. In this paper we present a highly optimized and cost-effective 3D neuron array architecture that is a natural fit for convolutional layers, along with a parameter selection framework to optimize its parameters for any given CNN model. We show through theoretical as well as empirical analyses that structuring compute elements in a 3D rather than a 2D topology can lead to higher performance through improved utilization of key FPGA resources. Our experimental results targeting a Virtex-7 FPGA demonstrate that our proposed technique can generate CNN accelerators that outperform the state-of-the-art solution by 1.80x up to a maximum of 4.05x for 32-bit floating-point and 16-bit fixed-point MAC implementations, respectively, across different CNN models. Additionally, our proposed technique can generate designs that are far more scalable in terms of compute resources. We also report on the energy consumption of our accelerator in comparison with a GPGPU implementation.	-
dc.description.degree	Master	-
dc.description	Department of Electrical and Computer Engineering	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/72054	-
dc.identifier.uri	http://unist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002301112	-
dc.language	eng	-
dc.publisher	Ulsan National Institute of Science and Technology (UNIST)	-
dc.rights.embargoReleaseTerms	9999-12-31	-
dc.title	Efficient FPGA Acceleration of Convolutional Deep Neural Networks	-
dc.type	Thesis	-
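
The abstract's argument that array topology affects resource utilization can be illustrated with a small calculation. The sketch below is not taken from the thesis: the function names, the choice of loop dimensions mapped onto each array, and the tile sizes are all assumptions chosen for illustration. It counts the MAC operations in one convolutional layer and estimates what fraction of a fixed-size compute array does useful work when each mapped loop dimension is processed in fixed-size chunks.

```python
from math import ceil


def conv_macs(out_maps, out_rows, out_cols, in_maps, kernel):
    """Multiply-accumulate operations in one convolutional layer.

    Stride and padding are reflected only through the output size.
    """
    return out_maps * out_rows * out_cols * in_maps * kernel * kernel


def array_utilization(dims, tiles):
    """Fraction of compute elements doing useful work.

    Each loop dimension d is processed in chunks of t elements, so a
    dimension that is not a multiple of t leaves part of the array idle
    on the last chunk. (Hypothetical model, not the thesis's own.)
    """
    useful = 1
    scheduled = 1
    for d, t in zip(dims, tiles):
        useful *= d
        scheduled *= ceil(d / t) * t
    return useful / scheduled


# Assumed example layer: 96 output maps of 55x55, 3 input maps, 11x11 kernel.
macs = conv_macs(96, 55, 55, 3, 11)

# Assumed 2D array of 64 x 7 = 448 elements over (output maps, input maps).
util_2d = array_utilization((96, 3), (64, 7))

# Assumed 3D array of 32 x 1 x 14 = 448 elements over
# (output maps, output rows, output columns).
util_3d = array_utilization((96, 55, 55), (32, 1, 14))

print(f"MACs: {macs}, 2D utilization: {util_2d:.3f}, 3D utilization: {util_3d:.3f}")
```

The tile sizes here were picked by hand for a single layer. The parameter selection framework mentioned in the abstract addresses the harder problem of choosing parameters that work well across all layers of a given CNN model, which this sketch does not attempt.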
