Related Researcher

이슬기

Lee, Seulki
Embedded Artificial Intelligence Lab.


Full metadata record

DC Field Value Language
dc.citation.number 3 -
dc.citation.startPage 91 -
dc.citation.title INTERNATIONAL JOURNAL OF COMPUTER VISION -
dc.citation.volume 134 -
dc.contributor.author Park, Yoel -
dc.contributor.author Lee, Jaewook -
dc.contributor.author Lee, Seulki -
dc.date.accessioned 2026-03-05T14:32:47Z -
dc.date.available 2026-03-05T14:32:47Z -
dc.date.created 2026-02-23 -
dc.date.issued 2026-01 -
dc.description.abstract In this paper, we introduce a memory-efficient CNN (convolutional neural network), which enables resource-constrained low-end embedded and IoT devices to perform on-device vision and audio tasks, such as image classification, object detection, and audio classification, using extremely low memory, i.e., only 63 KB on ImageNet classification. Based on the bottleneck block of MobileNet, we propose three design principles that significantly curtail the peak memory usage of a CNN so that it can fit the limited KB memory of the low-end device. First, 'input segmentation' divides an input image into a set of patches, including a central patch that overlaps the others, reducing the size (and memory requirement) of a large input image. Second, 'patch tunneling' builds independent tunnel-like paths consisting of multiple bottleneck blocks per patch, penetrating the entire model from an input patch to the last layer of the network, maintaining lightweight memory usage throughout the whole network. Lastly, 'bottleneck reordering' rearranges the execution order of convolution operations inside the bottleneck block so that memory usage remains constant regardless of the size of the convolution output channels. We also present 'peak memory aware quantization', which enables the desired peak memory reduction in the actual deployment of the quantized network. The experimental results show that the proposed network classifies ImageNet with extremely low memory (i.e., 63 KB) while achieving competitive top-1 accuracy (i.e., 61.58%). To the best of our knowledge, the memory usage of the proposed network is far smaller than that of state-of-the-art memory-efficient networks, i.e., up to 89x and 3.1x smaller than MobileNet (i.e., 5.6 MB) and MCUNet (i.e., 196 KB), respectively. -
dc.identifier.bibliographicCitation INTERNATIONAL JOURNAL OF COMPUTER VISION, v.134, no.3, pp.91 -
dc.identifier.doi 10.1007/s11263-025-02688-w -
dc.identifier.issn 0920-5691 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/90570 -
dc.identifier.wosid 001674370500007 -
dc.language English -
dc.publisher SPRINGER -
dc.title Designing Extremely Memory-Efficient CNNs for On-device Vision and Audio Tasks -
dc.type Article -
dc.description.isOpenAccess TRUE -
dc.relation.journalWebOfScienceCategory Computer Science, Artificial Intelligence -
dc.relation.journalResearchArea Computer Science -
dc.type.docType Article -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.subject.keywordAuthor Peak memory reduction -
dc.subject.keywordAuthor Image classification -
dc.subject.keywordAuthor Object detection -
dc.subject.keywordAuthor Audio classification -
dc.subject.keywordAuthor On-device CNN -
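The 'input segmentation' principle described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes a square input split into four corner patches plus a central patch that overlaps all of them, and the 224/112 sizes are assumptions chosen for illustration.

```python
import numpy as np

def segment_input(image, patch=112):
    """Split an HxW image into four corner patches plus a central
    patch that overlaps them, so each patch can later be processed
    independently with a small peak-memory footprint.
    Patch size is illustrative, not taken from the paper."""
    h, w = image.shape[:2]
    c_y, c_x = (h - patch) // 2, (w - patch) // 2
    return [
        image[:patch, :patch],                     # top-left
        image[:patch, -patch:],                    # top-right
        image[-patch:, :patch],                    # bottom-left
        image[-patch:, -patch:],                   # bottom-right
        image[c_y:c_y + patch, c_x:c_x + patch],   # central, overlapping
    ]

img = np.zeros((224, 224, 3), dtype=np.uint8)
patches = segment_input(img)
print(len(patches), patches[0].shape)  # -> 5 (112, 112, 3)
```

Each patch would then feed its own tunnel-like path of bottleneck blocks ('patch tunneling'), so the network never holds the full-resolution activation in memory at once.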


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.