Semi-parametric contextual bandits with graph-Laplacian regularization

Author(s)
Choi, Young-Geun; Kim, Gi-Soo; Paik, Seunghoon; Paik, Myunghee Cho
Issued Date
2023-10
DOI
10.1016/j.ins.2023.119367
URI
https://scholarworks.unist.ac.kr/handle/201301/65134
Citation
INFORMATION SCIENCES, v.645, pp.119367
Abstract
Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging. Several works have addressed the problem by investigating semi-parametric contextual bandits and have warned that ignoring non-stationarity can harm performance. Another prevalent aspect of human behavior is social interaction, which has become available in the form of a social network or graph structure. As a result, graph-based contextual bandits have received much attention. In this paper, we propose SemiGraphTS, a novel contextual Thompson-sampling algorithm for a graph-based semi-parametric reward model. Our algorithm is the first to be proposed in this setting. We derive an upper bound on the cumulative regret that can be expressed as a factor depending on the graph structure multiplied by the order of the bound for the semi-parametric model without a graph. We evaluate the proposed and existing algorithms via simulation and a real-data example.
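As a reading aid for the graph-Laplacian regularization named in the title, the following is a minimal sketch of the standard combinatorial Laplacian L = D - A and the smoothness penalty it induces on per-user parameter vectors. It is not the SemiGraphTS algorithm, and the abstract does not say which Laplacian variant or penalty form the paper uses; the function names, the unnormalized Laplacian, and the toy graph are all assumptions made for illustration.

```python
# Illustrative only: function names and the toy graph are hypothetical,
# and the unnormalized Laplacian L = D - A is an assumed choice.

def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric adjacency matrix (list of lists)."""
    n = len(adjacency)
    for i in range(n):
        if len(adjacency[i]) != n:
            raise ValueError("adjacency must be square")
        for j in range(n):
            if adjacency[i][j] != adjacency[j][i]:
                raise ValueError("adjacency must be symmetric")
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def laplacian_penalty(theta, laplacian):
    """Quadratic form sum_{i,j} L_ij <theta_i, theta_j> = tr(Theta^T L Theta)."""
    n = len(theta)
    return sum(
        laplacian[i][j] * dot(theta[i], theta[j])
        for i in range(n)
        for j in range(n)
    )


def pairwise_penalty(theta, adjacency):
    """Equivalent form: 0.5 * sum_{i,j} A_ij * ||theta_i - theta_j||^2."""
    n = len(theta)
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = [a - b for a, b in zip(theta[i], theta[j])]
            total += adjacency[i][j] * dot(diff, diff)
    return 0.5 * total


# Toy path graph on three users: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
# One hypothetical parameter vector per user.
theta = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 2.0]]

L = graph_laplacian(A)
print(laplacian_penalty(theta, L))
print(pairwise_penalty(theta, A))
```

The two penalties agree because the quadratic form of the Laplacian equals the edge-weighted sum of squared differences between neighbors, so connected users are pulled toward similar parameters while unconnected users are left unconstrained.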
Publisher
ELSEVIER SCIENCE INC
ISSN
0020-0255
Keyword (Author)
Contextual multi-armed bandit; Graph Laplacian; Semi-parametric reward model; Thompson sampling
