Full metadata record

DC Field Value Language
dc.contributor.advisor Kim, Gi-Soo -
dc.contributor.author Koh, Yeongjin -
dc.date.accessioned 2025-04-04T13:49:51Z -
dc.date.available 2025-04-04T13:49:51Z -
dc.date.issued 2025-02 -
dc.description.abstract The bandit algorithm is a reinforcement learning approach where an agent sequentially selects one of several actions in a given environment, observes the reward from that choice, and optimizes its policy to maximize cumulative rewards (or equivalently, minimize regret). When choosing among several actions, contextual information associated with each action can be valuable, particularly when the reward is correlated with the contextual features. However, when certain contextual features are missing, utilizing such data to choose actions and learn the reward model becomes challenging, thereby limiting the algorithm's effectiveness. This study proposes an enhanced bandit algorithm that imputes missing contextual features using statistical estimates under the Missing at Random (MAR) assumption. This approach reduces data loss and enables the use of more complete contextual information. To achieve this, we extend the GLOC (Generalized Linear Online-to-confidence-set Conversion) framework to effectively handle missing covariates. By integrating imputation techniques, the proposed method complements missing contextual information for each action, improving the model's performance while preserving the strengths of the GLOC framework. The proposed model, termed the Imputation-enhanced Generalized Linear Bandit (IGLB), efficiently utilizes available information to address the challenges posed by missing contexts. We evaluate the performance of IGLB through experiments on both synthetic datasets and the real-world Warfarin dataset, comparing its results with existing methods. Experimental results demonstrate that IGLB achieves improved predictive performance, highlighting the benefits of leveraging imputed context for more comprehensive decision-making. -
dc.description.degree Master -
dc.description Graduate School of Artificial Intelligence -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/86495 -
dc.identifier.uri http://unist.dcollection.net/common/orgView/200000865954 -
dc.language ENG -
dc.publisher Ulsan National Institute of Science and Technology -
dc.subject Bandit Algorithm -
dc.subject Statistical Learning -
dc.subject Statistical Inference -
dc.subject Missing -
dc.subject Missing Covariates -
dc.subject Missing Imputation -
dc.title Generalized Linear Bandit with Missing Covariates -
dc.type Thesis -
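The abstract above describes the core IGLB idea: impute missing contextual features under the MAR assumption, then run a generalized linear bandit on the completed contexts. As an illustrative sketch only (not the thesis's GLOC-based algorithm), the following toy Python simulation pairs running-mean imputation with a LinUCB-style index on a logistic reward; every function name and parameter here is a hypothetical stand-in.

```python
import numpy as np

def mean_impute(contexts, feature_means):
    """Fill NaN entries in per-arm context vectors with running feature means
    (a simple MAR-style imputation; the thesis uses statistical estimates)."""
    filled = contexts.copy()
    rows, cols = np.nonzero(np.isnan(filled))
    filled[rows, cols] = feature_means[cols]
    return filled

def run_bandit(T=2000, K=5, d=4, missing_prob=0.3, seed=0):
    """Toy contextual bandit: contexts have entries missing completely at
    random; the agent imputes them, then picks arms with a UCB-style index."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=d)
    A = np.eye(d)                      # ridge Gram matrix
    b = np.zeros(d)                    # reward-weighted feature sum
    obs_sum = np.zeros(d)              # running sums of observed entries
    obs_count = np.zeros(d)
    total_reward = 0.0
    for _ in range(T):
        X = rng.normal(size=(K, d))                  # true contexts
        mask = rng.random((K, d)) < missing_prob     # missingness pattern
        X_obs = X.copy()
        X_obs[mask] = np.nan
        means = obs_sum / np.maximum(obs_count, 1.0)
        X_imp = mean_impute(X_obs, means)
        theta_hat = np.linalg.solve(A, b)
        A_inv = np.linalg.inv(A)
        # optimism bonus: width of the confidence ellipsoid along each arm
        bonus = np.sqrt(np.einsum('ki,ij,kj->k', X_imp, A_inv, X_imp))
        arm = int(np.argmax(X_imp @ theta_hat + 0.5 * bonus))
        # Bernoulli reward from a logistic model on the true context
        p = 1.0 / (1.0 + np.exp(-X[arm] @ theta_true))
        r = rng.binomial(1, p)
        total_reward += r
        x = X_imp[arm]
        A += np.outer(x, x)
        b += r * x
        observed = ~mask[arm]
        obs_sum[observed] += X_obs[arm][observed]
        obs_count[observed] += 1
    return total_reward
```

The sketch imputes with feature-wise running means for brevity; the thesis's contribution is integrating a principled imputation step into the GLOC confidence-set machinery rather than this particular estimator.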

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.