
FlexInt: A New Number Format for Robust Sub-8-Bit Neural Network Inference

Author(s)
Hong, Minuk; Sim, Hyeonuk; Lee, Sugil; Lee, Jongeun
Issued Date
2024-10-27
URI
https://scholarworks.unist.ac.kr/handle/201301/85859
Citation
IEEE/ACM International Conference on Computer-Aided Design
Abstract
While previous work has demonstrated that even large DNNs can be quantized to very low precision (sub-8-bit integers), concerns over robustness across different types of networks and datasets have led to a more serious consideration of floating-point (FP) formats in the industry. However, at 8 bits and below, there is no universally accepted FP format or one that provides robust performance on diverse data distributions. Thus in this paper, based on our analysis of integer (INT) and FP formats, we propose a novel number format called FlexInt, with a high dynamic range similar to FP, yet low max rounding error, targeting efficient representation of DNNs for inference at 8 bits and below. We also propose a novel FlexInt MAC (Multiply-Accumulate) hardware architecture. Our experimental results using large networks on image classification and natural language processing demonstrate that our FlexInt can deliver more robust performance and far superior worst-case accuracy, compared to both INT and FP across various data distributions; has a hardware overhead similar to that of FP; and can consistently make near-Pareto-optimal area-accuracy trade-offs across diverse networks.
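The INT-versus-FP trade-off the abstract contrasts can be made concrete with a small numeric sketch. The snippet below is purely illustrative and does not define the paper's FlexInt encoding: it computes the dynamic range of a signed integer format (uniform spacing, so relative error blows up near zero) and the worst-case relative rounding error of a normalized floating-point format (bounded relative error everywhere in its normal range, at the cost of fewer precision bits). The E4M3-style parameters used in the example are an assumption for illustration.

```python
# Hypothetical illustration of the INT-vs-FP trade-off (not FlexInt itself).

def int_dynamic_range(bits: int) -> int:
    """Largest-to-smallest nonzero magnitude ratio for a signed integer
    format: every step has the same absolute size, so the ratio is just
    the largest representable magnitude."""
    return 2 ** (bits - 1) - 1          # e.g. 127 for INT8


def fp_max_relative_rounding_error(mantissa_bits: int) -> float:
    """Worst-case relative round-to-nearest error for normalized FP values.
    Within a binade the spacing is ulp = 2**exp * 2**-m and the value is at
    least 2**exp, so the error is at most 2**-(m+1) of the value."""
    return 2.0 ** -(mantissa_bits + 1)


if __name__ == "__main__":
    # INT8: only a 127:1 dynamic range, and a value near half the smallest
    # step rounds with up to ~50% relative error at the bottom of the range.
    print("INT8 dynamic range:", int_dynamic_range(8))
    # An E4M3-style FP8 format keeps relative error <= 2**-4 = 6.25%
    # throughout its normal range, trading precision bits for exponent bits.
    print("E4M3-style max relative error:", fp_max_relative_rounding_error(3))
```

Under these assumptions, a format like the one the abstract describes would aim to combine the FP-like bound on relative error with a wider usable range than a fixed-step integer grid.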
Publisher
IEEE/ACM

