Related Researcher

나승훈

Na, Seung-Hoon
Natural Language Processing Lab

Full metadata record

DC Field Value Language
dc.citation.endPage 2677 -
dc.citation.number 10 -
dc.citation.startPage 2670 -
dc.citation.title IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS -
dc.citation.volume E89D -
dc.contributor.author Kang, In-Su -
dc.contributor.author Na, Seung-Hoon -
dc.contributor.author Lee, Jong-Hyeok -
dc.date.accessioned 2025-04-25T15:14:10Z -
dc.date.available 2025-04-25T15:14:10Z -
dc.date.created 2025-04-08 -
dc.date.issued 2006-10 -
dc.description.abstract Compound noun segmentation is a key component of Korean language processing. Supervised approaches require some form of human intervention, such as maintaining lexicons, manually segmenting corpora, or devising heuristic rules. Thus, they suffer from the unknown word problem, and cannot distinguish domain-oriented or corpus-directed segmentation results from the others. These problems can be overcome by unsupervised approaches that employ segmentation clues obtained purely from a raw corpus. However, most unsupervised approaches require tuning of empirical parameters or learning of a statistical dictionary. To develop a tuning-less, learning-free unsupervised segmentation algorithm, this study proposes a pruning-based unsupervised technique that eliminates unhelpful segmentation candidates. In addition, unlike previous unsupervised methods that have relied on purely character-based segmentation clues, this study utilizes word-based segmentation clues. Experimental evaluations show that the pruning scheme is very effective for unsupervised segmentation of Korean compound nouns, and that the use of word-based prior knowledge enables better segmentation accuracy. This study also shows that the proposed algorithm performs competitively with or better than other unsupervised methods. -
dc.identifier.bibliographicCitation IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, v.E89D, no.10, pp.2670 - 2677 -
dc.identifier.doi 10.1093/ietisy/e89-d.10.2670 -
dc.identifier.issn 0916-8532 -
dc.identifier.scopusid 2-s2.0-33750067703 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/86846 -
dc.identifier.wosid 000241296100010 -
dc.language English -
dc.publisher IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG -
dc.title Pruning-based unsupervised segmentation for Korean -
dc.type Article -
dc.description.isOpenAccess FALSE -
dc.relation.journalWebOfScienceCategory Computer Science, Information Systems; Computer Science, Software Engineering -
dc.relation.journalResearchArea Computer Science -
dc.type.docType Article -
dc.description.journalRegisteredClass scie -
dc.description.journalRegisteredClass scopus -
dc.subject.keywordAuthor unsupervised method -
dc.subject.keywordAuthor pruning technique -
dc.subject.keywordAuthor segmentation evaluation -
dc.subject.keywordAuthor compound noun segmentation -

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.