Title: Finite Budget Analysis of Multi-Armed Bandit Problems
|
| Authors: Xia, YC; Qin, T; Ding, WK; Li, HF; Zhang, XD; Yu, NH; Liu, TY
|
| Author Full Names: Xia, Yingce; Qin, Tao; Ding, Wenkui; Li, Haifang; Zhang, Xudong; Yu, Nenghai; Liu, Tie-Yan
|
| Source: NEUROCOMPUTING, 258 13-29; SI 10.1016/j.neucom.2016.12.079 OCT 4 2017
|
| Language: English
|
| Abstract: In the budgeted multi-armed bandit (MAB) problem, a player receives a random reward and pays a cost after pulling an arm, and cannot pull any more arms after running out of budget. In this paper, we give an extensive study of upper-confidence-bound based algorithms and a greedy algorithm for budgeted MABs. We perform theoretical analysis on the proposed algorithms and show that they all enjoy sublinear regret bounds with respect to the budget B. Furthermore, by carefully choosing the parameters, they can even achieve log-linear regret bounds. We also prove that the asymptotic lower bound for budgeted Bernoulli bandits is Omega(ln B). Our proof technique can be used to improve the theoretical results for fractional KUBE [26] and Budgeted Thompson Sampling [30]. (C) 2017 Elsevier B.V. All rights reserved.
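| The setting described in the abstract can be illustrated with a minimal simulation. The sketch below is NOT the paper's exact algorithms; it is an assumed UCB-style index on the empirical reward-to-cost ratio for Bernoulli rewards and costs, stopping when the budget B is exhausted:

```python
import math
import random

def budgeted_ucb(reward_means, cost_means, budget, seed=0):
    """Illustrative budgeted bandit loop (a sketch, not the paper's method).

    Each arm i yields Bernoulli(reward_means[i]) reward and pays a
    Bernoulli(cost_means[i]) cost per pull; play stops once the budget
    cannot cover the next cost.
    """
    rng = random.Random(seed)
    k = len(reward_means)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    total_reward = 0.0
    t = 0
    while budget > 0:
        t += 1
        if t <= k:
            arm = t - 1  # initialization: pull each arm once
        else:
            # UCB-style index: empirical reward/cost ratio plus an
            # exploration bonus (an assumed form, for illustration only).
            arm = max(range(k), key=lambda i:
                      reward_sum[i] / max(cost_sum[i], 1e-9)
                      + math.sqrt(2.0 * math.log(t) / pulls[i]))
        r = 1.0 if rng.random() < reward_means[arm] else 0.0
        c = 1.0 if rng.random() < cost_means[arm] else 0.0
        if c > budget:  # cannot pay this cost: budget effectively exhausted
            break
        budget -= c
        pulls[arm] += 1
        reward_sum[arm] += r
        cost_sum[arm] += c
        total_reward += r
    return total_reward, pulls
```

| With a budget of B = 200 and three arms, the arm with the highest reward-to-cost ratio should receive most of the pulls, while the others are explored only O(ln B) times, matching the sublinear-regret intuition in the abstract.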
|
| ISSN: 0925-2312
|
| eISSN: 1872-8286
|
| IDS Number: EZ4SD
|
| Unique ID: WOS:000404701900003