Generalized Linear Bandit with Missing Covariates

Author(s)
Koh, Yeongjin
Advisor
Kim, Gi-Soo
Issued Date
2025-02
URI
https://scholarworks.unist.ac.kr/handle/201301/86495
http://unist.dcollection.net/common/orgView/200000865954
Abstract
The bandit algorithm is a reinforcement learning approach where an agent sequentially selects one of several actions in a given environment, observes the reward from that choice, and optimizes its policy to maximize cumulative rewards (or equivalently, minimize regret). When choosing among several actions, contextual information associated with each action can be valuable, particularly when the reward is correlated with the contextual features. However, when certain contextual features are missing, utilizing such data to choose actions and learn the reward model becomes challenging, thereby limiting the algorithm’s effectiveness. This study proposes an enhanced bandit algorithm that imputes missing contextual features using statistical estimates under the Missing at Random (MAR) assumption. This approach reduces data loss and enables the use of more complete contextual information. To achieve this, we extend the GLOC (Generalized Linear Online-to-confidence-set Conversion) framework to effectively handle missing covariates. By integrating imputation techniques, the proposed method complements missing contextual information for each action, improving the model’s performance while preserving the strengths of the GLOC framework. The proposed model, termed the Imputation-enhanced Generalized Linear Bandit (IGLB), efficiently utilizes available information to address the challenges posed by missing contexts. We evaluate the performance of IGLB through experiments on both synthetic datasets and the real-world Warfarin dataset, comparing its results with existing methods. Experimental results demonstrate that IGLB achieves improved predictive performance, highlighting the benefits of leveraging imputed context for more comprehensive decision-making.
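The general idea in the abstract (impute missing contextual features, then choose actions with a generalized linear reward model) can be illustrated with a small sketch. This is not the IGLB algorithm and does not implement GLOC or its confidence sets. It assumes running-mean imputation, a logistic reward model, greedy action selection, and a plain online gradient update, all of which are simplifications chosen for illustration. The function names, the simulated environment, and the default parameters are hypothetical.

```python
import math
import random


def impute(context, running_mean):
    """Replace missing entries (None) with the current running mean of that feature."""
    return [m if x is None else x for x, m in zip(context, running_mean)]


def update_running_mean(context, running_mean, counts):
    """Update per-feature means in place, using only the observed entries."""
    for j, x in enumerate(context):
        if x is not None:
            counts[j] += 1
            running_mean[j] += (x - running_mean[j]) / counts[j]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def choose_arm(contexts, theta):
    """Greedy choice (an assumption): arm with the largest predicted mean reward."""
    scores = [sigmoid(dot(theta, c)) for c in contexts]
    return max(range(len(contexts)), key=scores.__getitem__)


def sgd_step(theta, x, reward, lr=0.1):
    """One online gradient step on the logistic negative log-likelihood."""
    err = sigmoid(dot(theta, x)) - reward
    return [t - lr * err * xi for t, xi in zip(theta, x)]


def run(n_rounds=200, n_arms=3, p_missing=0.3, true_theta=(1.0, -1.0), seed=0):
    """Simulate a toy environment (hypothetical) with entries missing at random."""
    rng = random.Random(seed)
    dim = len(true_theta)
    theta = [0.0] * dim
    mean, counts = [0.0] * dim, [0] * dim
    total_reward = 0
    for _ in range(n_rounds):
        full = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_arms)]
        # Each entry is hidden independently with probability p_missing.
        observed = [[None if rng.random() < p_missing else x for x in c] for c in full]
        for c in observed:
            update_running_mean(c, mean, counts)
        imputed = [impute(c, mean) for c in observed]
        arm = choose_arm(imputed, theta)
        # The reward depends on the full context, which the learner never sees.
        reward = 1 if rng.random() < sigmoid(dot(true_theta, full[arm])) else 0
        theta = sgd_step(theta, imputed[arm], reward)
        total_reward += reward
    return theta, total_reward
```

Calling `run()` returns the learned parameter vector and the cumulative reward for one simulated run. A method in the spirit of the abstract would replace the greedy rule with a confidence-set-based choice and could use a richer imputation estimate than a running mean.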
Publisher
Ulsan National Institute of Science and Technology
Degree
Master
Major
Graduate School of Artificial Intelligence

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.