
Extending Neural Processing Unit and Compiler for Advanced Binarized Neural Networks

Author(s)
Song, Minjoon; Asim, Faaiz; Lee, Jongeun
Issued Date
2024-01-22
DOI
10.1109/ASP-DAC58780.2024.10473822
URI
https://scholarworks.unist.ac.kr/handle/201301/85863
Citation
29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024, pp.115 - 120
Abstract
Binarized neural networks (BNNs) are one of the most promising approaches to deploying deep neural network models on resource-constrained devices. However, there is very little compiler and programmable-accelerator support for BNNs, especially for modern BNNs that use scale factors and skip connections to maximize network performance. In this paper, we present a set of methods to extend a neural processing unit (NPU) and a compiler to support modern BNNs. Our novel ideas include (i) batch-norm folding for binarized layers with scale factors and skip connections, (ii) efficient handling of convolutions with few input channels, and (iii) bit-packing pipelining. Our evaluation using BiRealNet-18 on an FPGA board demonstrates that our compiler-architecture hybrid approach can yield significant speedups for binary convolution layers over the baseline NPU. Our approach also gives 3.6–5.5× better end-to-end performance on BiRealNet-18 compared with previous BNN compiler approaches.
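The bit-packing the abstract refers to enables the standard BNN compute trick: once ±1 activations and weights are packed into machine words, a dot product reduces to XNOR plus popcount. The sketch below is a minimal illustration of that generic idea in NumPy, not the paper's NPU pipeline; the function names `pack_bits` and `binary_dot` are our own for illustration.

```python
import numpy as np

def pack_bits(x):
    """Pack a {-1, +1} vector into bytes: +1 -> bit 1, -1 -> bit 0."""
    return np.packbits((x > 0).astype(np.uint8))

def binary_dot(a_bits, b_bits, n):
    """XNOR-popcount dot product of two packed {-1, +1} vectors of length n.

    matches = popcount(XNOR(a, b)) over the n valid bits, so
    dot(a, b) = matches - mismatches = 2 * matches - n.
    """
    xnor = np.bitwise_not(np.bitwise_xor(a_bits, b_bits))
    # Restrict the popcount to the n valid bit positions (packbits zero-pads).
    matches = int(np.unpackbits(xnor)[:n].sum())
    return 2 * matches - n

rng = np.random.default_rng(0)
a = rng.choice([-1, 1], size=100)
b = rng.choice([-1, 1], size=100)
assert binary_dot(pack_bits(a), pack_bits(b), a.size) == int(a @ b)
```

On real hardware the same arithmetic runs on word-wide XNOR and popcount units, which is why packing throughput (addressed by the paper's bit-packing pipelining) matters for end-to-end speed.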
Publisher
Institute of Electrical and Electronics Engineers Inc.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.