Related Researcher

Lee, Seulki (이슬기)
Embedded Artificial Intelligence Lab.

Detailed Information


Weight Separation for Memory-Efficient and Accurate Deep Multitask Learning

Author(s)
Lee, Seulki; Nirjon, Shahriar
Issued Date
2022-03-22
DOI
10.1109/PerCom53586.2022.9762400
URI
https://scholarworks.unist.ac.kr/handle/201301/76291
Citation
IEEE International Conference on Pervasive Computing and Communications, pp. 13-22
Abstract
We propose a new concept called Weight Separation of deep neural networks (DNNs), which enables memory-efficient and accurate deep multitask learning on a memory-constrained embedded system. The goal of weight separation is to pack multiple heterogeneous DNNs into the limited memory of the system as tightly as possible while preserving the prediction accuracy of the constituent DNNs. The proposed approach separates the DNN weights into two types of weight-pages, each consisting of a subset of the weight parameters: shared and exclusive weight-pages. It optimally distributes the weight-pages across two levels of the system memory hierarchy and stores them separately, i.e., the shared weight-pages in primary (level-1) memory (e.g., RAM) and the exclusive weight-pages in secondary (level-2) memory (e.g., flash disk or SSD). First, to reduce the memory usage of multiple DNNs, less critical weight parameters are identified and overlapped onto the shared weight-pages deployed in the limited space of primary (main) memory. Next, to retain the prediction accuracy of multiple DNNs, the essential weight parameters that play a critical role in preserving prediction accuracy are stored intact, without overlapping, in the plentiful space of secondary memory in the form of exclusive weight-pages. We implement two real systems applying the proposed weight separation: 1) a microcontroller-based multitask IoT system that performs multitask learning of 10 scaled-down DNNs by separating the weight parameters into FRAM and flash disk, and 2) an embedded GPU system that performs multitask learning of 10 state-of-the-art DNNs by separating the weight parameters into GPU RAM and eMMC. Our evaluation shows that the memory efficiency, prediction accuracy, and execution time of deep multitask learning improve by up to 5.9x, 2.0%, and 13.1x, respectively, without any modification of the DNN models.
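Illustrative Sketch
A minimal Python sketch of the weight-separation idea described in the abstract: flat weight vectors are split into fixed-size weight-pages, page slots are ranked by a criticality score, the least critical slots are overlapped into a shared page set (standing in for primary memory), and the remaining pages stay exclusive per task (standing in for secondary storage). The page size, the page budget, and the mean-|w| criticality score are illustrative assumptions for exposition only, not the authors' implementation.

import numpy as np

PAGE = 256    # parameters per weight-page (illustrative; not the paper's value)
BUDGET = 4    # shared pages that fit in primary memory (illustrative)

def paginate(w, page=PAGE):
    # Split a flat weight vector into fixed-size weight-pages, zero-padding the tail.
    flat = np.pad(w.ravel(), (0, (-w.size) % page))
    return flat.reshape(-1, page)

def separate(task_weights, budget=BUDGET):
    # Paginate each task's weights, then rank page slots by a stand-in
    # criticality score (mean |w| averaged across tasks; the paper's actual
    # criticality analysis is not reproduced here).
    pages = {t: paginate(w) for t, w in task_weights.items()}
    n = min(p.shape[0] for p in pages.values())
    crit = np.mean([np.abs(p[:n]).mean(axis=1) for p in pages.values()], axis=0)
    shared_ids = set(np.argsort(crit)[:budget].tolist())
    # Overlap the least critical slots: one physical page per shared slot,
    # averaged across tasks, destined for primary (level-1) memory.
    shared = {i: np.mean([pages[t][i] for t in pages], axis=0) for i in shared_ids}
    # Critical pages stay intact per task, destined for secondary (level-2) storage.
    exclusive = {t: {i: p[i] for i in range(p.shape[0]) if i not in shared_ids}
                 for t, p in pages.items()}
    return shared, exclusive

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tasks = {"dnn_a": rng.normal(size=2048), "dnn_b": rng.normal(size=2048)}
    shared, exclusive = separate(tasks)
    print(len(shared), "shared pages;",
          {t: len(v) for t, v in exclusive.items()}, "exclusive pages per task")

Under these assumptions, each 2048-parameter task yields 8 page slots, of which 4 are overlapped into shared pages and 4 remain exclusive per task.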
Publisher
Institute of Electrical and Electronics Engineers
ISSN
2474-2503
