File Download

There are no files associated with this item.

  • Find it @ UNIST provides direct access to the published full text of this article (UNISTARs only).
Related Researcher

Hwang, Ranggi (황랑기)

Detailed Information


Full metadata record

DC Field Value Language
dc.citation.endPage 16 -
dc.citation.number 1 -
dc.citation.startPage 13 -
dc.citation.title IEEE COMPUTER ARCHITECTURE LETTERS -
dc.citation.volume 22 -
dc.contributor.author Lee, Seonho -
dc.contributor.author Hwang, Ranggi -
dc.contributor.author Park, Jongse -
dc.contributor.author Rhu, Minsoo -
dc.date.accessioned 2025-09-25T13:30:01Z -
dc.date.available 2025-09-25T13:30:01Z -
dc.date.created 2025-09-25 -
dc.date.issued 2023-01 -
dc.description.abstract The recent advancement of natural language processing (NLP) models is the result of ever-increasing model sizes and datasets. Most modern NLP models adopt the Transformer architecture, whose main bottleneck lies in the self-attention mechanism. Because the computation required for self-attention grows rapidly with model size, self-attention has become the main challenge in deploying NLP models. Several prior works have sought to address this bottleneck, but most suffer from significant design overheads and additional training requirements. In this work, we propose HAMMER, a hardware-friendly approximate computing solution for self-attention that employs mean-redistribution and linearization, effectively improving the performance of the self-attention mechanism with low overhead. Compared to previous state-of-the-art self-attention accelerators, HAMMER improves performance by 1.2-1.6x and energy efficiency by 1.2-1.5x. -
dc.identifier.bibliographicCitation IEEE COMPUTER ARCHITECTURE LETTERS, v.22, no.1, pp.13 - 16 -
dc.identifier.doi 10.1109/LCA.2022.3233832 -
dc.identifier.issn 1556-6056 -
dc.identifier.scopusid 2-s2.0-85147207144 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/88086 -
dc.identifier.wosid 000932425700001 -
dc.language English -
dc.publisher IEEE COMPUTER SOC -
dc.title HAMMER: Hardware-Friendly Approximate Computing for Self-Attention With Mean-Redistribution And Linearization -
dc.type Article -
dc.description.isOpenAccess FALSE -
dc.relation.journalWebOfScienceCategory Computer Science, Hardware & Architecture -
dc.relation.journalResearchArea Computer Science -
dc.type.docType Article -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.subject.keywordAuthor transformers -
dc.subject.keywordAuthor Approximate computing -
dc.subject.keywordAuthor hardware accelerator -
dc.subject.keywordAuthor neural network -
dc.subject.keywordAuthor sparse computation -
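
The abstract above identifies the self-attention computation as the main Transformer bottleneck. For orientation, the sketch below shows standard scaled dot-product self-attention in NumPy, the baseline operation whose cost grows quadratically with sequence length. This is a generic illustration, not the paper's mean-redistribution or linearization technique; every name, shape, and weight is an assumption made for the example.

```python
# Minimal sketch of standard scaled dot-product self-attention.
# Generic baseline only; not HAMMER's approximation. All shapes and
# names are illustrative assumptions.
import numpy as np

def self_attention(x, wq, wk, wv):
    """x: (n, d) token embeddings; wq/wk/wv: (d, d) projection weights."""
    q, k, v = x @ wq, x @ wk, x @ wv            # three (n, d) projections
    scores = q @ k.T / np.sqrt(k.shape[-1])     # (n, n): quadratic in sequence length n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                          # (n, d) attended output

rng = np.random.default_rng(0)
n, d = 128, 64                                  # sequence length, embedding dim
x = rng.standard_normal((n, d))
out = self_attention(x, *(rng.standard_normal((d, d)) for _ in range(3)))
print(out.shape)  # (128, 64)
```

The (n, n) score matrix is the term that dominates as sequences grow, which is why the abstract singles out self-attention as the deployment challenge that approximation schemes like HAMMER target.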

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.