IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS
Abstract
Convolutional neural networks (CNNs) achieve remarkable accuracy in vision tasks, yet their computational complexity hinders low-power edge deployment. In this work, we present COMET, a framework for CNN acceleration that employs hardware-efficient offset-binary coding (OBC) techniques to co-optimize performance and resource utilization. The approach formulates CNN inference using OBC representations applied separately to the inputs (Scheme A) and the weights (Scheme B), enabling exploitation of bit-width asymmetry between the two. The shift-accumulate operation is modified by merging the OBC offset term into the pre-scaled bias. Leveraging the symmetries inherent in Schemes A and B, we introduce four look-up table (LUT) techniques (parallel, shared, split, and hybrid) and evaluate their efficiency. Building on this foundation, we develop a general matrix multiplication (GEMM) core based on the im2col transformation for efficient CNN acceleration. Using LeNet-5 and All-CNN-C, we demonstrate that the OBC-GEMM core efficiently supports modern workloads. Evaluations show that COMET enables more efficient FPGA deployment than state-of-the-art designs, with negligible accuracy loss, demonstrating its efficiency and scalability across diverse network architectures.
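As a rough illustration of the OBC idea (a sketch of the general coding principle, not the paper's hardware implementation, and the function names below are ours), each bit b_i of an unsigned B-bit operand can be remapped to a digit d_i = 2b_i - 1 in {-1, +1}, so that x = (sum_i d_i 2^i + 2^B - 1) / 2. The constant offset term (2^B - 1) * sum_j w_j can then be folded into the pre-scaled bias of the shift-accumulate path, as the abstract describes:

```python
def obc_digits(x, bits):
    # Remap each bit b_i of an unsigned value to d_i = 2*b_i - 1 in {-1, +1},
    # so that x = (sum_i d_i * 2^i + 2^bits - 1) / 2  (offset-binary coding).
    return [2 * ((x >> i) & 1) - 1 for i in range(bits)]

def obc_dot(xs, ws, bits, bias=0):
    # Compute y = sum_j w_j * x_j + bias with the inputs in OBC form
    # (Scheme A style: OBC applied to the activations).
    # The constant offset term is accumulated once, pre-scaled with the bias,
    # instead of being re-added on every shift-accumulate step.
    offset = (2 ** bits - 1) * sum(ws)
    acc = 0
    for x, w in zip(xs, ws):
        for i, d in enumerate(obc_digits(x, bits)):
            acc += d * w * (1 << i)        # signed shift-accumulate on +/-1 digits
    # acc + offset equals 2 * sum_j w_j * x_j, so the halving below is exact.
    return (acc + offset) // 2 + bias
```

For example, `obc_dot([3, 5], [2, -1], 4, bias=7)` reproduces `3*2 + 5*(-1) + 7 = 8`. In hardware, the inner loop over digits maps to the LUT-based shift-accumulate structures the paper develops.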