
Extending Neural Processing Unit and Compiler for Advanced Binarized Neural Networks

Author(s)
Song, Minjoon; Asim, Faaiz; Lee, Jongeun
Issued Date
2024-01-22
DOI
10.1109/ASP-DAC58780.2024.10473822
URI
https://scholarworks.unist.ac.kr/handle/201301/85863
Citation
29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024, pp.115 - 120
Abstract
Binarized neural networks (BNNs) are one of the most promising approaches to deploying deep neural network models on resource-constrained devices. However, there is very little compiler and programmable-accelerator support for BNNs, especially for modern BNNs that use scale factors and skip connections to maximize network performance. In this paper, we present a set of methods to extend a neural processing unit (NPU) and a compiler to support modern BNNs. Our novel ideas include (i) batch-norm folding for binarized layers with scale factors and skip connections, (ii) efficient handling of convolutions with few input channels, and (iii) bit-packing pipelining. Our evaluation using BiRealNet-18 on an FPGA board demonstrates that our compiler-architecture hybrid approach can yield significant speedups for binary convolution layers over the baseline NPU. Our approach also gives 3.6–5.5× better end-to-end performance on BiRealNet-18 compared with previous BNN compiler approaches.
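The bit-packing the abstract refers to enables the standard BNN compute trick: once ±1 activations and weights are packed into machine words, a dot product reduces to XNOR plus popcount. The sketch below is a minimal illustration of that generic idea in NumPy, not the paper's NPU pipeline; the function names `pack_bits` and `binary_dot` are our own for illustration.

```python
import numpy as np

def pack_bits(x):
    """Pack a {-1, +1} vector into bytes: +1 -> bit 1, -1 -> bit 0."""
    return np.packbits((x > 0).astype(np.uint8))

def binary_dot(a_bits, b_bits, n):
    """XNOR-popcount dot product of two packed {-1, +1} vectors of length n.

    matches = popcount(XNOR(a, b)) over the n valid bits, so
    dot(a, b) = matches - mismatches = 2 * matches - n.
    """
    xnor = np.bitwise_not(np.bitwise_xor(a_bits, b_bits))
    # Restrict the popcount to the n valid bit positions (packbits zero-pads).
    matches = int(np.unpackbits(xnor)[:n].sum())
    return 2 * matches - n

rng = np.random.default_rng(0)
a = rng.choice([-1, 1], size=100)
b = rng.choice([-1, 1], size=100)
assert binary_dot(pack_bits(a), pack_bits(b), a.size) == int(a @ b)
```

On real hardware the same arithmetic runs on word-wide XNOR and popcount units, which is why packing throughput (addressed by the paper's bit-packing pipelining) matters for end-to-end speed.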
Publisher
Institute of Electrical and Electronics Engineers Inc.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.