Full metadata record

DC Field Value Language
dc.contributor.advisor Yoon, Heein -
dc.contributor.author An, Sun Hong -
dc.date.accessioned 2026-03-26T22:14:01Z -
dc.date.available 2026-03-26T22:14:01Z -
dc.date.issued 2026-02 -
dc.description.abstract The transformer architecture has enabled large language models (LLMs) to improve a wide range of AI applications. A primary component, the multi-head self-attention mechanism, presents a major bottleneck due to its extensive computational and memory bandwidth requirements. While recent approaches leveraging sparse attention and attention formula reordering address these challenges, efficient LLM processing remains a key challenge for existing hardware. This thesis proposes an energy-efficient compute-in/near-memory (CINM) processor using eDRAM to mitigate these bottlenecks through three key features. First, an attention block fusion computation strategy is employed to maximize data reuse within the attention map. This approach yields an 85.86% reduction in external memory access and raises hardware utilization to 86.1%. Second, a CINM architecture resolves the imbalance between memory and computation, which, combined with a heterogeneous pipeline, achieves a 77.27% reduction in system latency. Third, a compute-in-memory array supporting the cross-read operation eliminates data direction conflicts, resulting in a 98.44% latency reduction. Furthermore, this array utilizes dual-row computation with reduced adder logic to improve energy efficiency by 1.58×. The processor, designed in 28 nm CMOS technology, achieves 36.28–58.05 TOPS/W and demonstrates an F1 score of 92.41% on the SQuAD v1.1 benchmark using the BigBird-large model. -
dc.description.degree Master -
dc.description Department of Electrical Engineering -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/90965 -
dc.identifier.uri http://unist.dcollection.net/common/orgView/200000965198 -
dc.language ENG -
dc.publisher Ulsan National Institute of Science and Technology -
dc.subject Compute-in-memory, eDRAM, Sparse attention, Large language models -
dc.title An Energy-Efficient Compute-in/near-Memory eDRAM Processor for Sparse Transformer-Based Large Language Models -
dc.type Thesis -


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.