Anytime many-armed bandits
17
0
0
Texte intégral
Documents relatifs
In the finite horizon case or with a discounted cost and an infinite horizon, we show that when the number of particles becomes large, the optimal cost of the system converges
τ may be as large as N, so that the number of bisection steps, each of complexity O(N), may be arbitrarily large. Finally, we note that projection onto the simplex is a particular