An Energy-Efficient Compute-in/near-Memory eDRAM Processor for Sparse Transformer-Based Large Language Models

Author(s)
An, Sun Hong
Advisor
Yoon, Heein
Issued Date
2026-02
URI
https://scholarworks.unist.ac.kr/handle/201301/90965
http://unist.dcollection.net/common/orgView/200000965198
Abstract
The transformer architecture has enabled large language models (LLMs) to improve a wide range of AI applications. A primary component, the multi-head self-attention mechanism, presents a major bottleneck due to its extensive computational and memory-bandwidth requirements. While recent approaches based on sparse attention and attention-formula reordering address these challenges, efficient LLM processing remains a key bottleneck on existing hardware. This thesis proposes an energy-efficient compute-in/near-memory (CINM) processor using eDRAM to mitigate these bottlenecks through three key features. First, an attention-block fusion computation strategy maximizes data reuse within the attention map, yielding an 85.86% reduction in external memory access and raising hardware utilization to 86.1%. Second, a CINM architecture resolves the imbalance between memory and computation and, combined with a heterogeneous pipeline, achieves a 77.27% reduction in system latency. Third, a compute-in-memory array supporting the cross-read operation eliminates data-direction conflicts, resulting in a 98.44% latency reduction. This array also uses dual-row computation with reduced adder logic to improve energy efficiency by 1.58×. The processor, designed in 28 nm CMOS technology, achieves 36.28–58.05 TOPS/W and demonstrates an F1 score of 92.41% on the SQuAD v1.1 benchmark with the BigBird-large model.
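The block-sparse attention that BigBird-class models use (and that the processor described above accelerates) can be illustrated with a minimal NumPy sketch. The block size and visibility pattern below (each query block attends to itself, its neighbors, and one global block) are illustrative assumptions, not the thesis's actual configuration; the point is that the score matrix shrinks from n² entries to only the visible blocks, which is the source of the memory-access savings.

```python
import numpy as np

def block_sparse_attention(Q, K, V, block, mask_blocks):
    """Attention computed only over allowed (query-block, key-block) pairs.

    mask_blocks[i] lists the key-block indices visible to query block i
    (illustrative BigBird-style pattern, not the processor's real mask).
    """
    n, d = Q.shape
    nb = n // block
    out = np.zeros_like(Q)
    for i in range(nb):
        q = Q[i*block:(i+1)*block]                      # (block, d)
        ks = np.concatenate([K[j*block:(j+1)*block] for j in mask_blocks[i]])
        vs = np.concatenate([V[j*block:(j+1)*block] for j in mask_blocks[i]])
        s = q @ ks.T / np.sqrt(d)                       # scores over visible keys only
        p = np.exp(s - s.max(axis=1, keepdims=True))    # numerically stable softmax
        p /= p.sum(axis=1, keepdims=True)
        out[i*block:(i+1)*block] = p @ vs
    return out

# Toy setup: 128 tokens in 8 blocks of 16; each query block sees itself,
# its neighbors, and global block 0 (assumed pattern).
n, d, block = 128, 32, 16
nb = n // block
mask = [sorted({0, max(i - 1, 0), i, min(i + 1, nb - 1)}) for i in range(nb)]
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = block_sparse_attention(Q, K, V, block, mask)

dense_entries = n * n                                    # full attention map
sparse_entries = sum(len(m) * block for m in mask) * block
print(out.shape, f"score entries: {sparse_entries}/{dense_entries}")
```

With a fully dense mask (every block visible to every block) the function reduces to ordinary softmax attention, which makes the sparse variant easy to sanity-check against a reference implementation.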
Publisher
Ulsan National Institute of Science and Technology
Degree
Master
Major
Department of Electrical Engineering


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.