Related Researcher


Senocak, Arda


Full metadata record

DC Field Value Language
dc.citation.endPage 2979 -
dc.citation.startPage 2975 -
dc.citation.title IEEE SIGNAL PROCESSING LETTERS -
dc.citation.volume 31 -
dc.contributor.author Erol, Mehmet Hamza -
dc.contributor.author Senocak, Arda -
dc.contributor.author Feng, Jiu -
dc.contributor.author Chung, Joon Son -
dc.date.accessioned 2025-09-03T14:00:01Z -
dc.date.available 2025-09-03T14:00:01Z -
dc.date.created 2025-09-03 -
dc.date.issued 2024-10 -
dc.description.abstract Transformers have rapidly become the preferred choice for audio classification, surpassing methods based on CNNs. However, Audio Spectrogram Transformers (ASTs) exhibit quadratic scaling due to self-attention. The removal of this quadratic self-attention cost presents an appealing direction. Recently, state space models (SSMs), such as Mamba, have demonstrated potential in language and vision tasks in this regard. In this study, we explore whether reliance on self-attention is necessary for audio classification tasks. By introducing Audio Mamba (AuM), the first self-attention-free, purely SSM-based model for audio classification, we aim to address this question. We evaluate AuM on various audio datasets, comprising six different benchmarks, where it achieves comparable or better performance than the well-established AST model. -
dc.identifier.bibliographicCitation IEEE SIGNAL PROCESSING LETTERS, v.31, pp.2975 - 2979 -
dc.identifier.doi 10.1109/LSP.2024.3483009 -
dc.identifier.issn 1070-9908 -
dc.identifier.scopusid 2-s2.0-85207707217 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/87863 -
dc.identifier.wosid 001346118800003 -
dc.language English -
dc.publisher IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC -
dc.title Audio Mamba: Bidirectional State Space Model for Audio Representation Learning -
dc.type Article -
dc.description.isOpenAccess FALSE -
dc.relation.journalWebOfScienceCategory Engineering, Electrical & Electronic -
dc.relation.journalResearchArea Engineering -
dc.type.docType Article -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.subject.keywordAuthor Transformers -
dc.subject.keywordAuthor Spectrogram -
dc.subject.keywordAuthor Computational modeling -
dc.subject.keywordAuthor Training -
dc.subject.keywordAuthor Context modeling -
dc.subject.keywordAuthor Adaptation models -
dc.subject.keywordAuthor Standards -
dc.subject.keywordAuthor Graphics processing units -
dc.subject.keywordAuthor Feature extraction -
dc.subject.keywordAuthor Complexity theory -
dc.subject.keywordAuthor Audio classification -
dc.subject.keywordAuthor audio spectrogram transformers -
dc.subject.keywordAuthor state space models -
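The abstract above contrasts the quadratic self-attention cost of Audio Spectrogram Transformers with the linear-time recurrence of state space models such as Mamba. The following is a minimal, purely illustrative sketch of that scaling difference (it is not the authors' implementation; the operation counts, dimensions, and function names are hypothetical):

```python
# Toy comparison of pairwise self-attention cost vs. a linear SSM scan.
# Illustrative sketch only: AuM itself is a bidirectional, selective
# SSM architecture; here we only count how work grows with sequence
# length L for the two mechanisms the abstract contrasts.

def attention_ops(seq_len: int, dim: int) -> int:
    """Attention forms an L x L score matrix: every token attends to
    every other token, so cost grows as O(L^2 * d)."""
    return seq_len * seq_len * dim

def ssm_scan_ops(seq_len: int, dim: int, state: int) -> int:
    """A state-space scan applies a recurrence h_t = A h_{t-1} + B x_t
    and reads out y_t = C h_t, so cost grows as O(L * d * N) for a
    hidden state of size N."""
    return seq_len * dim * state * 2  # one update + one readout per step

for L in (256, 1024, 4096):
    a = attention_ops(L, dim=768)
    s = ssm_scan_ops(L, dim=768, state=16)
    print(f"L={L:5d}  attention={a:.2e}  ssm={s:.2e}  ratio={a / s:.1f}x")
```

Under these toy counts the attention-to-SSM ratio is L / (2N), so it widens linearly with sequence length, which is the motivation the abstract gives for removing self-attention.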


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.