Related Researcher

Kim, Gi-Soo (김지수)
Statistical Decision Making


Heavy-tailed Linear Bandit with Huber Regression

Author(s)
Kang, Minhyun; Kim, Gi-Soo
Issued Date
2023-07-31
URI
https://scholarworks.unist.ac.kr/handle/201301/67923
Fulltext
https://proceedings.mlr.press/v216/kang23a.html
Citation
Conference on Uncertainty in Artificial Intelligence, pp.1027 - 1036
Abstract
Linear bandit algorithms have been extensively studied and have proven successful in sequential decision tasks despite their simplicity. Many algorithms, however, work under the assumption that the reward is the sum of a linear function of the observed contexts and a sub-Gaussian error. In practical applications, errors can be heavy-tailed, especially in financial data. In such reward environments, algorithms designed for sub-Gaussian errors may under-explore, resulting in suboptimal regret. In this paper, we relax the reward assumption and propose a novel linear bandit algorithm that works well under heavy-tailed errors as well. The proposed algorithm utilizes Huber regression. When the contexts are stochastic with a positive definite covariance matrix and the (1 + δ)-th moment of the error is bounded by a constant, we show that the high-probability upper bound of the regret is O(√d · T^{1/(1+δ)} · (log dT)^{δ/(1+δ)}), where d is the dimension of the context variables, T is the time horizon, and δ ∈ (0, 1]. This bound improves on the state-of-the-art regret bound of the Median of Means and Truncation algorithms by factors of √(log T) and √d for the case where the time horizon T is unknown. We also remark that when δ = 1, the order is the same as the regret bound of linear bandit algorithms designed for sub-Gaussian errors. We support our theoretical findings with synthetic experiments.
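The abstract's central idea — replacing least-squares estimation with Huber regression so that heavy-tailed reward noise cannot dominate the parameter estimate — can be illustrated with a minimal sketch. This is not the paper's bandit algorithm: the gradient-descent solver, the threshold value, and the Student-t noise (which has a bounded (1 + δ)-th moment for δ < 1 but infinite variance) are illustrative assumptions for the regression step only.

```python
import numpy as np

def huber_grad(r, tau):
    # Derivative of the Huber loss in the residual: identity for small
    # residuals, clipped to +/- tau for large (outlying) residuals.
    return np.clip(r, -tau, tau)

def huber_regression(X, y, tau=1.0, lr=0.1, n_iter=2000):
    # Minimize the (convex) Huber loss by plain gradient descent.
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        residual = y - X @ theta
        grad = -X.T @ huber_grad(residual, tau) / n
        theta -= lr * grad
    return theta

rng = np.random.default_rng(0)
d, n = 5, 2000
theta_star = rng.normal(size=d)          # unknown linear parameter
X = rng.normal(size=(n, d))              # stochastic contexts
eps = rng.standard_t(df=2, size=n)       # heavy-tailed error (infinite variance)
y = X @ theta_star + eps

theta_hat = huber_regression(X, y)
print(np.linalg.norm(theta_hat - theta_star))  # small estimation error
```

Because the gradient contribution of each sample is bounded by tau, a few extreme noise draws cannot pull the estimate far from the truth, which is the robustness property the regret analysis builds on.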
Publisher
ML Research Press
ISSN
2640-3498

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.