Inproceedings,

Randomized Exploration in Generalized Linear Bandits

B. Kveton, M. Zaheer, C. Szepesvári, L. Li, M. Ghavamzadeh, and C. Boutilier.
AISTATS, (March 2020)

Abstract

We study two randomized algorithms for generalized linear bandits, GLM-TSL and GLM-FPL. GLM-TSL samples a generalized linear model (GLM) from the Laplace approximation to the posterior distribution. GLM-FPL fits a GLM to a randomly perturbed history of past rewards. We prove C d (n log K)^(1/2) bounds (up to log factors) on the n-round regret of GLM-TSL and GLM-FPL, where d is the number of features and K is the number of arms. The regret bound of GLM-TSL improves upon prior work and the regret bound of GLM-FPL is the first of its kind. We apply both GLM-TSL and GLM-FPL to logistic and neural network bandits, and show that they perform well empirically. In more complex models, GLM-FPL is significantly faster. Our results showcase the role of randomization, beyond sampling from the posterior, in exploration.
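
To make the two exploration schemes in the abstract concrete, here is a minimal sketch for a logistic model. It is an illustration under stated assumptions, not the authors' implementation: the function names (`fit_logistic`, `tsl_choose`, `fpl_choose`), the `scale` and `reg` parameters, the Newton solver, and the choice of Gaussian reward perturbations are all hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(X, y, reg=1.0, iters=50):
    """L2-regularized logistic fit by Newton's method (assumed solver).
    y may be real-valued, which is what a perturbed history produces.
    Returns the estimate and the Hessian of the loss at that estimate."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) + reg * theta
        H = (X.T * (p * (1.0 - p))) @ X + reg * np.eye(d)
        theta = theta - np.linalg.solve(H, grad)
    p = sigmoid(X @ theta)
    H = (X.T * (p * (1.0 - p))) @ X + reg * np.eye(d)
    return theta, H

def tsl_choose(arms, X, y, rng, scale=1.0):
    """Sample a model from a Gaussian (Laplace-style) approximation
    N(theta_hat, scale^2 * H^{-1}) and act greedily on the sample."""
    theta_hat, H = fit_logistic(X, y)
    L = np.linalg.cholesky(H)            # H = L @ L.T
    z = rng.standard_normal(len(theta_hat))
    theta_s = theta_hat + scale * np.linalg.solve(L.T, z)
    return int(np.argmax(arms @ theta_s))

def fpl_choose(arms, X, y, rng, scale=1.0):
    """Fit the model to rewards perturbed by Gaussian noise (assumed
    noise distribution) and act greedily on the perturbed fit."""
    noise = rng.normal(0.0, scale, size=y.shape)
    theta_s, _ = fit_logistic(X, y + noise)
    return int(np.argmax(arms @ theta_s))
```

Because the logistic mean function is increasing, picking the arm with the largest inner product is the same as picking the arm with the largest predicted mean reward. In this sketch the sampling variant needs a Hessian factorization, while the perturbation variant only needs a model fit, which is one plausible reading of why perturbation is cheaper in more complex models.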

BibTeX key: KZSZLGB20
entry type: inproceedings
booktitle: AISTATS
year: 2020
month: March
date-added: 2020-03-07 14:31:44 -0700
bdsk-url-1: http://proceedings.mlr.press/v54/hanawal17a.html
pdf: papers/AISTATS2020-GLB.pdf
date-modified: 2020-03-07 14:56:07 -0700
url: https://arxiv.org/abs/1906.08947
