Maximizing and Satisficing in Multi-armed Bandits with Graph Information

Parth Thaker, Nikhil Rao, Mohit Malu, Gautam Dasarathy

November 2022


Abstract

We study pure exploration in multi-armed bandits with graph side information. In particular, we consider the best-arm and near-best-arm identification problems in the fixed-confidence setting, under the assumption that the arm rewards are smooth with respect to a given arbitrary graph. This captures a range of real-world pure exploration scenarios in which one has information about the similarity of the options or actions under consideration. We propose a novel algorithm, GRUB, for this problem and provide a theoretical characterization of its performance that elicits the benefit of the graph side information. We complement our theory with experimental results showing that capitalizing on available graph side information yields significant improvements over pure exploration methods that are unable to use this information.
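As a rough illustration of what "smooth with respect to a graph" can mean, the sketch below scores a vector of mean rewards by the graph Laplacian quadratic form, i.e. the sum of squared differences across edges. This is one common way to formalize smoothness and is an assumption here, not a statement of the paper's exact definition; the graph, the mean values, and the function name `laplacian_quadratic_form` are all hypothetical. It is not an implementation of GRUB.

```python
def laplacian_quadratic_form(means, edges):
    """Sum of (means[i] - means[j])**2 over undirected edges (i, j).

    For an unweighted graph this equals mu^T L mu, where L is the
    graph Laplacian. Smaller values mean neighboring arms have
    more similar mean rewards.
    """
    return sum((means[i] - means[j]) ** 2 for i, j in edges)


# Hypothetical 4-arm path graph: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]

# Neighboring arms have similar means: small quadratic form.
smooth_means = [0.50, 0.55, 0.60, 0.65]

# Neighboring arms differ sharply: large quadratic form.
rough_means = [0.50, 0.10, 0.90, 0.20]

smooth_score = laplacian_quadratic_form(smooth_means, edges)
rough_score = laplacian_quadratic_form(rough_means, edges)

print(f"smooth: {smooth_score:.4f}, rough: {rough_score:.4f}")
```

Under this reading, a small score suggests that pulling one arm is informative about its neighbors, which is the kind of side information the abstract refers to.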

Type

Conference paper

Publication

Neural Information Processing Systems, 2022