24th Asia and South Pacific Design Automation Conference, ASPDAC 2019, pp. 280-285
Abstract
For energy efficiency, many low-bit quantization methods have been proposed for deep neural networks (DNNs). Among them, logarithmic quantization has drawn attention because it maintains acceptable deep learning performance while simplifying high-cost multipliers and drastically reducing the memory footprint. Meanwhile, stochastic computing (SC) has been proposed for low-cost DNN acceleration, and a recently proposed SC multiplier significantly improved accuracy and latency, the main drawbacks of SC. However, in its binary-interfaced system, which still costs far less than storing entire stochastic streams, quantization remains linear, the same as in conventional fixed-point binary. We apply logarithmically quantized DNNs to this state-of-the-art SC multiplier and study the resulting benefits. We find that SC multiplication on logarithmically quantized inputs is more accurate and that it aids the fine-tuning process. Furthermore, we design a much lower-cost SC-DNN accelerator that exploits the reduced complexity of the inputs. While logarithmic quantization also benefits the data flow, the proposed architecture achieves 40% less area and 24% less power consumption than the previous SC-DNN accelerator, and its area × latency product is smaller even than that of a shifter-based accelerator.
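To illustrate why logarithmic quantization simplifies the multiplier, the sketch below shows the basic idea in Python. It is a minimal illustration, not the paper's exact scheme: the function names, 4-bit exponent width, and clipping range are assumptions. Once a weight is rounded to a signed power of two, multiplying a fixed-point activation by it reduces to an arithmetic shift.

```python
import numpy as np

def log2_quantize(w, bitwidth=4):
    """Quantize a weight to a signed power of two.
    Illustrative sketch; bitwidth and clipping range are assumptions."""
    sign = np.sign(w)
    # Round the log2 magnitude to the nearest integer exponent.
    exp = np.round(np.log2(np.abs(w) + 1e-12)).astype(int)
    # Clip exponents to the range representable with the given bitwidth.
    exp = np.clip(exp, -(2 ** (bitwidth - 1)), 0)
    return sign, exp

def shift_multiply(x_fixed, sign, exp):
    """Multiply a fixed-point activation by a power-of-two weight
    using an arithmetic shift instead of a hardware multiplier."""
    shifted = x_fixed >> (-exp) if exp <= 0 else x_fixed << exp
    return int(sign) * shifted

# Example: weight 0.23 quantizes to 2^-2, so x * w becomes x >> 2.
s, e = log2_quantize(np.array(0.23))
print(shift_multiply(64, s, e))  # 64 >> 2 = 16
```

In the paper's setting, such power-of-two inputs feed the SC multiplier rather than a plain shifter, which is why the reduced input complexity translates into the reported area and power savings.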