Related Researcher

Lee, Seulki (이슬기)
Embedded Artificial Intelligence Lab.

Detailed Information


Weight Separation for Memory-Efficient and Accurate Deep Multitask Learning

Author(s)
Lee, Seulki; Nirjon, Shahriar
Issued Date
2022-03-22
DOI
10.1109/PerCom53586.2022.9762400
URI
https://scholarworks.unist.ac.kr/handle/201301/76291
Citation
IEEE International Conference on Pervasive Computing and Communications, pp. 13-22
Abstract
We propose a new concept called Weight Separation of deep neural networks (DNNs), which enables memory-efficient and accurate deep multitask learning on a memory-constrained embedded system. The goal of weight separation is to pack multiple heterogeneous DNNs into the limited memory of the system as tightly as possible while preserving the prediction accuracy of the constituent DNNs. The proposed approach separates the DNN weights into two types of weight-pages, each consisting of a subset of the weight parameters: shared and exclusive weight-pages. It optimally distributes the weight-pages across two levels of the system memory hierarchy and stores them separately, i.e., the shared weight-pages in primary (level-1) memory (e.g., RAM) and the exclusive weight-pages in secondary (level-2) memory (e.g., flash disk or SSD). First, to reduce the memory usage of multiple DNNs, less critical weight parameters are identified and overlapped onto the shared weight-pages deployed in the limited space of primary (main) memory. Next, to retain the prediction accuracy of multiple DNNs, the essential weight parameters that play a critical role in preserving prediction accuracy are stored intact, without overlapping, in the plentiful space of secondary memory in the form of exclusive weight-pages. We implement two real systems applying the proposed weight separation: 1) a microcontroller-based multitask IoT system that performs multitask learning of 10 scaled-down DNNs by separating the weight parameters into FRAM and flash disk, and 2) an embedded GPU system that performs multitask learning of 10 state-of-the-art DNNs by separating the weight parameters into GPU RAM and eMMC. Our evaluation shows that the memory efficiency, prediction accuracy, and execution time of deep multitask learning improve by up to 5.9x, 2.0%, and 13.1x, respectively, without any modification of the DNN models.
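Illustrative Sketch
A minimal Python sketch of the weight-separation idea described in the abstract: flat weight vectors are split into fixed-size weight-pages, page slots are ranked by a criticality score, the least critical slots are overlapped into a shared page set (standing in for primary memory), and the remaining pages stay exclusive per task (standing in for secondary storage). The page size, the page budget, and the mean-|w| criticality score are illustrative assumptions for exposition only, not the authors' implementation.

import numpy as np

PAGE = 256    # parameters per weight-page (illustrative; not the paper's value)
BUDGET = 4    # shared pages that fit in primary memory (illustrative)

def paginate(w, page=PAGE):
    # Split a flat weight vector into fixed-size weight-pages, zero-padding the tail.
    flat = np.pad(w.ravel(), (0, (-w.size) % page))
    return flat.reshape(-1, page)

def separate(task_weights, budget=BUDGET):
    # Paginate each task's weights, then rank page slots by a stand-in
    # criticality score (mean |w| averaged across tasks; the paper's actual
    # criticality analysis is not reproduced here).
    pages = {t: paginate(w) for t, w in task_weights.items()}
    n = min(p.shape[0] for p in pages.values())
    crit = np.mean([np.abs(p[:n]).mean(axis=1) for p in pages.values()], axis=0)
    shared_ids = set(np.argsort(crit)[:budget].tolist())
    # Overlap the least critical slots: one physical page per shared slot,
    # averaged across tasks, destined for primary (level-1) memory.
    shared = {i: np.mean([pages[t][i] for t in pages], axis=0) for i in shared_ids}
    # Critical pages stay intact per task, destined for secondary (level-2) storage.
    exclusive = {t: {i: p[i] for i in range(p.shape[0]) if i not in shared_ids}
                 for t, p in pages.items()}
    return shared, exclusive

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tasks = {"dnn_a": rng.normal(size=2048), "dnn_b": rng.normal(size=2048)}
    shared, exclusive = separate(tasks)
    print(len(shared), "shared pages;",
          {t: len(v) for t, v in exclusive.items()}, "exclusive pages per task")

Under these assumptions, each 2048-parameter task yields 8 page slots, of which 4 are overlapped into shared pages and 4 remain exclusive per task.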
Publisher
Institute of Electrical and Electronics Engineers
ISSN
2474-2503
