Semi-parametric contextual bandits with graph-Laplacian regularization

Choi, Young-Geun; Kim, Gi-Soo; Paik, Seunghoon; Paik, Myunghee Cho

doi:10.1016/j.ins.2023.119367

Scholarworks@UNIST

UNIST Library


Related Researcher

김지수

Kim, Gi-Soo: Statistical Decision Making


Full metadata record

DC Field	Value	Language
dc.citation.startPage	119367	-
dc.citation.title	INFORMATION SCIENCES	-
dc.citation.volume	645	-
dc.contributor.author	Choi, Young-Geun	-
dc.contributor.author	Kim, Gi-Soo	-
dc.contributor.author	Paik, Seunghoon	-
dc.contributor.author	Paik, Myunghee Cho	-
dc.date.accessioned	2023-12-21T11:43:17Z	-
dc.date.available	2023-12-21T11:43:17Z	-
dc.date.created	2023-08-22	-
dc.date.issued	2023-10	-
dc.description.abstract	Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging. Several works have addressed the problem by investigating semi-parametric contextual bandits and have warned that ignoring non-stationarity could harm performance. Another prevalent human behavior is social interaction, which has become available in the form of a social network or graph structure. As a result, graph-based contextual bandits have received much attention. In this paper, we propose SemiGraphTS, a novel contextual Thompson-sampling algorithm for a graph-based semi-parametric reward model. Our algorithm is the first to be proposed in this setting. We derive an upper bound on the cumulative regret that can be expressed as a multiple of a factor depending on the graph structure and the order for the semi-parametric model without a graph. We evaluate the proposed and existing algorithms via simulation and a real-data example.	-
dc.identifier.bibliographicCitation	INFORMATION SCIENCES, v.645, pp.119367	-
dc.identifier.doi	10.1016/j.ins.2023.119367	-
dc.identifier.issn	0020-0255	-
dc.identifier.scopusid	2-s2.0-85164237798	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/65134	-
dc.identifier.wosid	001036367700001	-
dc.language	English	-
dc.publisher	ELSEVIER SCIENCE INC	-
dc.title	Semi-parametric contextual bandits with graph-Laplacian regularization	-
dc.type	Article	-
dc.description.isOpenAccess	FALSE	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalResearchArea	Computer Science	-
dc.type.docType	Article	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordAuthor	Contextual multi-armed bandit	-
dc.subject.keywordAuthor	Graph Laplacian	-
dc.subject.keywordAuthor	Semi-parametric reward model	-
dc.subject.keywordAuthor	Thompson sampling	-



ScholarWorks@UNIST was established as an OAK Project for the National Library of Korea.