Exact indexing for support vector machines

Yu, H.; Ko, I.; Kim, Youngdae; Hwang, S.; Han, W.-S.

doi:10.1145/1989323.1989398

Scholarworks@UNIST

UNIST Library

File Download

There are no files associated with this item.

SFX Link

Find it @ UNIST can give you direct access to the published full text of this article. (UNISTARs only)

Related Researcher

김영대

Kim, Youngdae

Read More

Views & Downloads

Detailed Information

Cited time in webofscience

Cited time in scopus

Metadata Downloads

Exact indexing for support vector machines

Author(s): Yu, H., Ko, I., Kim, Youngdae, Hwang, S., Han, W.-S.

Issued Date: 2011-06-12

DOI: 10.1145/1989323.1989398

URI: https://scholarworks.unist.ac.kr/handle/201301/83438

Citation: 2011 ACM SIGMOD and 30th PODS 2011 Conference, pp.709 - 720

Abstract: SVM (Support Vector Machine) is a well-established machine learning methodology popularly used for classification, regression, and ranking. Recently SVM has been actively researched for rank learning and applied to various applications including search engines or relevance feedback systems. A query in such systems is the ranking function F learned by SVM. Once learning a function F or formulating the query, processing the query to find top-k results requires evaluating the entire database by F. So far, there exists no exact indexing solution for SVM functions. Existing top-k query processing algorithms are not applicable to the machine-learned ranking functions, as they often make restrictive assumptions on the query, such as linearity or monotonicity of functions. Existing metric-based or reference-based indexing methods are also not applicable, because data points are invisible in the kernel space (SVM feature space) on which the index must be built. Existing kernel indexing methods return approximate results or fix kernel parameters. This paper proposes an exact indexing solution for SVM functions with varying kernel parameters. We first propose key geometric properties of the kernel space - ranking instability and ordering stability - which is crucial for building indices in the kernel space. Based on them, we develop an index structure iKernel and processing algorithms. We then present clustering techniques in the kernel space to enhance the pruning effectiveness of the index. According to our experiments, iKernel is highly effective overall producing 1∼5% of evaluation ratio on large data sets. According to our best knowledge, iKernel is the first indexing solution that finds exact top-k results of SVM functions without a full scan of data set. © 2011 ACM.

Publisher: Association for Computing Machinery

ISSN: 0730-8078

Show Full Item Record

qrcode

RSS 1.0 RSS 2.0

UNIST | Library

Tel : 052-217-1403 / Email : scholarworks@unist.ac.kr

ScholarWorks@UNIST was established as an OAK Project for the National Library of Korea.