Valid oversampling schemes to handle imbalance

Author(s)
Kim, Young-geun; Kwon, Yongchan; Paik, Myunghee Cho
Issued Date
2019-07
DOI
10.1016/j.patrec.2019.07.006
URI
https://scholarworks.unist.ac.kr/handle/201301/90577
Fulltext
https://www.sciencedirect.com/science/article/pii/S0167865519301965?via%3Dihub
Citation
PATTERN RECOGNITION LETTERS, v.125, pp.661 - 667
Abstract
Class imbalance is a common problem in machine learning. When data are not balanced, the correct specification rate for the minor class suffers even if overall accuracy is high. Oversampling has been used to address the issue, often without considering the accuracy it sacrifices. In addition, an arbitrary oversampling scheme may introduce bias. In this paper, we propose principled methods of handling imbalance under user-specified constraints on sensitivity and specificity. Our work makes three contributions. First, we provide an optimized target proportion that minimizes the maximum error rate under user-specified constraints on sensitivity and specificity. Second, we introduce the notion of resampling at random (RAR), under which the limit of the estimator is not altered from that of the original sample. These two elements are relevant to any classification method. Third, we derive asymptotic properties of selected classifiers when RAR oversampling is applied with the target proportion. Finally, we implement the proposed method in an image recognition context, using features extracted from the last layer of deep convolutional neural networks (CNNs). We present an analysis of fundus data to classify diabetic retinopathy using the proposed method. © 2019 Elsevier B.V. All rights reserved.
Publisher
ELSEVIER
ISSN
0167-8655
Keyword (Author)
Optimal oversampling target proportion; Resampling at random; Medical imaging; Imbalance; Oversampling
Keyword
CLASSIFICATION
