
FlexInt: A New Number Format for Robust Sub-8-Bit Neural Network Inference

Author(s)
Hong, Minuk; Sim, Hyeonuk; Lee, Sugil; Lee, Jongeun
Issued Date
2024-10-27
URI
https://scholarworks.unist.ac.kr/handle/201301/85859
Citation
IEEE/ACM International Conference on Computer-Aided Design
Abstract
While previous work has demonstrated that even large DNNs can be quantized to very low precision (sub-8-bit integers), concerns over robustness across different types of networks and datasets have led to a more serious consideration of floating-point (FP) formats in the industry. However, at 8 bits and below, there is no universally accepted FP format or one that provides robust performance on diverse data distributions. Thus in this paper, based on our analysis of integer (INT) and FP formats, we propose a novel number format called FlexInt, with a high dynamic range similar to FP, yet low max rounding error, targeting efficient representation of DNNs for inference at 8 bits and below. We also propose a novel FlexInt MAC (Multiply-Accumulate) hardware architecture. Our experimental results using large networks on image classification and natural language processing demonstrate that our FlexInt can deliver more robust performance and far superior worst-case accuracy, compared to both INT and FP across various data distributions; has a hardware overhead similar to that of FP; and can consistently make near-Pareto-optimal area-accuracy trade-offs across diverse networks.
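The INT-versus-FP trade-off the abstract contrasts can be made concrete with a small numeric sketch. The snippet below is purely illustrative and does not define the paper's FlexInt encoding: it computes the dynamic range of a signed integer format (uniform spacing, so relative error blows up near zero) and the worst-case relative rounding error of a normalized floating-point format (bounded relative error everywhere in its normal range, at the cost of fewer precision bits). The E4M3-style parameters used in the example are an assumption for illustration.

```python
# Hypothetical illustration of the INT-vs-FP trade-off (not FlexInt itself).

def int_dynamic_range(bits: int) -> int:
    """Largest-to-smallest nonzero magnitude ratio for a signed integer
    format: every step has the same absolute size, so the ratio is just
    the largest representable magnitude."""
    return 2 ** (bits - 1) - 1          # e.g. 127 for INT8


def fp_max_relative_rounding_error(mantissa_bits: int) -> float:
    """Worst-case relative round-to-nearest error for normalized FP values.
    Within a binade the spacing is ulp = 2**exp * 2**-m and the value is at
    least 2**exp, so the error is at most 2**-(m+1) of the value."""
    return 2.0 ** -(mantissa_bits + 1)


if __name__ == "__main__":
    # INT8: only a 127:1 dynamic range, and a value near half the smallest
    # step rounds with up to ~50% relative error at the bottom of the range.
    print("INT8 dynamic range:", int_dynamic_range(8))
    # An E4M3-style FP8 format keeps relative error <= 2**-4 = 6.25%
    # throughout its normal range, trading precision bits for exponent bits.
    print("E4M3-style max relative error:", fp_max_relative_rounding_error(3))
```

Under these assumptions, a format like the one the abstract describes would aim to combine the FP-like bound on relative error with a wider usable range than a fixed-step integer grid.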
Publisher
IEEE/ACM

