On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes

With non-stationary policies, the asymptotic performance bound of Value Iteration with per-iteration errors bounded by ε improves to γ/(1−γ)·ε, which is significant in the usual situation when γ is close to 1. Given Bellman operators that can only be computed with some error ε, a surprising consequence of this result is that the problem of “computing an approximately optimal non-stationary policy” is much simpler than that of “computing an approximately optimal stationary policy”, and even slightly simpler than that of “approximately computing the value of some fixed policy”, since this last problem only has a guarantee of 1/(1−γ)·ε.
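For reference, a hedged LaTeX summary of the three guarantees being compared; the (1−γ)⁻² bound for the last stationary policy is the classical approximate Value Iteration bound (stated here as an assumption, with constants omitted), and the value-function symbols are illustrative notation rather than the paper's.

```latex
% Hedged summary (constants omitted): per-iteration error bounded by \epsilon.
\[
\underbrace{\lVert v_* - v_{\pi_k}\rVert_\infty \;\lesssim\; \frac{\gamma}{(1-\gamma)^2}\,\epsilon}_{\text{last stationary policy (classical bound)}}
\qquad
\underbrace{\lVert v_* - v_{\pi_{k,m}}\rVert_\infty \;\lesssim\; \frac{\gamma}{1-\gamma}\,\epsilon}_{\text{periodic non-stationary policy}}
\qquad
\underbrace{\lVert v_\pi - \hat v_\pi\rVert_\infty \;\le\; \frac{1}{1-\gamma}\,\epsilon}_{\text{approximate policy evaluation}}
\]
```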

Algorithmic aspects of mean–variance optimization in Markov decision processes

Mean-variance optimization problems resembling ours have been studied in the literature. For example, Guo, Ye, and Yin (2012) consider a mean-variance optimization problem, but subject to a constraint on the vector of expected rewards starting from each state, which results in a simpler problem, amenable to a policy iteration approach. Collins (1997) provides an apparently exponential-time algorithm for a variant of our problem, and Tamar, Di-Castro, and Mannor (2012) propose a policy gradient approach that aims at a locally optimal solution. Expressions for the variance of the discounted reward for stationary policies were developed by Sobel (1982). However, these expressions are quadratic in the underlying transition probabilities and do not lead to convex optimization problems. Similarly, much of the earlier literature on the problem (see Kawai (1987) and Huang and Kallenberg (1994) for a unified approach) provides various mathematical programming formulations. In general, these formulations either deal with problems that differ qualitatively from ours, focusing on the variation of reward from its average (Filar, Kallenberg, & Lee, 1989; White, 1992), or are nonconvex, and therefore do not address the issue of polynomial-time solvability, which is our focus. Indeed, we are not aware of any complexity results on mean-variance optimization problems. We finally note some interesting variance bounds obtained by Arlotto, Gans, and Steele (2013).

Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes

The second perspective of this work, not unrelated, is to develop simulation-based algorithms for finding lexicographic solutions to MDPs. Reinforcement Learning algorithms [41] make it possible to solve large MDPs by making use of simulated trajectories of states and actions. It is not immediate to develop RL algorithms for possibilistic MDPs, since no unique stochastic transition function corresponds to a possibility distribution [42]. However, uniform simulation of trajectories (with random choice of actions) may be used to generate an approximation of the possibilistic decision tree (provided that both the transition possibilities and the utility of the leaf are given with each simulated trajectory). So, interleaving simulations and lexicographic dynamic programming may lead to RL-type algorithms for approximating lexicographically optimal policies for (large) possibilistic MDPs.

Lexicographic refinements in stationary possibilistic Markov Decision Processes

The present section defines an extension of lexicographic refinements to finite-horizon possibilistic Markov decision processes and proposes a value iteration algorithm that looks for policies optimal with respect to these criteria.

3.1. Lexi-refinements of ordinal aggregations

In ordinal (i.e. min-based and max-based) aggregation, a solution to the drowning effect based on leximin and leximax comparisons has been proposed by [19]. It has then been extended to non-sequential decision making under uncertainty [13] and, in the sequential case, to decision trees [4]. Let us first recall the basic definition of these two preference relations, for any two vectors t and t′ of length m built on the scale L:
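The excerpt stops before stating the two relations, so, for reference, here is a reconstruction of the standard leximin and leximax definitions (an assumption based on the usual definitions, not a quotation from the paper):

```latex
% Standard leximin/leximax orderings on vectors of length m over a totally ordered
% scale L (reconstruction of the usual definitions).
% Let t_{(1)} \le \dots \le t_{(m)} be the components of t sorted in non-decreasing
% order, and t_{[1]} \ge \dots \ge t_{[m]} the components sorted in non-increasing order.
\[
t \succ_{\mathrm{leximin}} t' \iff \exists k \le m:\;
\forall i < k,\; t_{(i)} = t'_{(i)} \;\text{ and }\; t_{(k)} > t'_{(k)},
\]
\[
t \succ_{\mathrm{leximax}} t' \iff \exists k \le m:\;
\forall i < k,\; t_{[i]} = t'_{[i]} \;\text{ and }\; t_{[k]} > t'_{[k]}.
\]
```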

Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes

Possibilistic decision theory was proposed twenty years ago and has had several extensions since then. Even though it is appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback. Qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power of the theory, several refinements have been proposed. Lexicographic refinements are particularly appealing since they make it possible to benefit from the Expected Utility background while remaining qualitative. This article aims at extending lexicographic refinements to sequential decision problems, i.e., to possibilistic decision trees and possibilistic Markov decision processes, when the horizon is finite. We present two criteria that refine qualitative possibilistic utilities and provide dynamic programming algorithms for calculating lexicographically optimal policies.

Lightweight Verification of Markov Decision Processes with Rewards

The Kearns algorithm is the classic ‘sparse sampling algorithm’ for large, infinite-horizon, discounted MDPs. It constructs a ‘near optimal’ scheduler piecewise, by approximating the best action from the current state using a stochastic depth-first search. The algorithm can work with large, potentially infinite-state MDPs because it explores a probabilistically bounded search space. This, however, is exponential in the discount. To find the action with the greatest expected reward in the current state, the algorithm recursively estimates the rewards of successive states, up to some maximum depth defined by the discount and the desired error. Actions are enumerated, while probabilistic choices are explored by sampling, with the number of samples set as a parameter. By iterating local exploration with probabilistic sampling, the discount guarantees that the algorithm eventually converges. The stopping criterion is that successive estimates differ by less than some error threshold.

Scalable Verification of Markov Decision Processes

The Kearns algorithm [13] is the classic ‘sparse sampling algorithm’ for large, infinite-horizon, discounted MDPs. It constructs a ‘near optimal’ scheduler piecewise, by approximating the best action from the current state using a stochastic depth-first search. Importantly, optimality is with respect to rewards, not probability (as required by standard model checking tasks). The algorithm can work with large, potentially infinite-state MDPs because it explores a probabilistically bounded search space. This, however, is exponential in the discount. To find the action with the greatest expected reward in the current state, the algorithm recursively estimates the rewards of successive states, up to some maximum depth defined by the discount and the desired error. Actions are enumerated, while probabilistic choices are explored by sampling, with the number of samples set as a parameter. The error is specified as a maximum difference between consecutive estimates, allowing the discount to guarantee that the algorithm will eventually terminate.
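To make the recursion concrete, here is a minimal Python sketch of the sparse-sampling estimate described above; the generative model `simulate(state, action) -> (next_state, reward)` and the `actions(state)` enumerator are assumed interfaces for illustration, not code from either paper.

```python
def sparse_sampling_value(simulate, actions, state, depth, width, gamma):
    """Estimate the optimal discounted value of `state` by recursive sampling.

    `depth` is the maximum look-ahead implied by the discount and target error;
    `width` is the number of samples drawn per action (both set as parameters).
    """
    if depth == 0:
        return 0.0
    best = float("-inf")
    for a in actions(state):                 # actions are enumerated exhaustively
        total = 0.0
        for _ in range(width):               # probabilistic choices are sampled
            next_state, reward = simulate(state, a)
            total += reward + gamma * sparse_sampling_value(
                simulate, actions, next_state, depth - 1, width, gamma)
        best = max(best, total / width)
    return best

def sparse_sampling_action(simulate, actions, state, depth, width, gamma):
    """Pick the action whose sampled one-step look-ahead value is largest."""
    def q(a):
        total = 0.0
        for _ in range(width):
            s2, r = simulate(state, a)
            total += r + gamma * sparse_sampling_value(
                simulate, actions, s2, depth - 1, width, gamma)
        return total / width
    return max(actions(state), key=q)
```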

Limits of Multi-Discounted Markov Decision Processes

Although the mean-payoff parity and the priority weighted functions are both generalizations of the parity and mean-payoff functions, they have radically different properties. The main difference is that using the mean-payoff parity function does not guarantee the existence of pure stationary optimal strategies, and the controller may even need an infinite amount of memory to play optimally [3]. On the other hand, the use of a priority mean-payoff function guarantees the existence of optimal strategies that are pure and stationary (cf. Theorem 2 in this paper, or also [12]). Another difference between the mean-payoff parity and the priority mean-payoff function arises when we consider the stochastic framework. The mean-payoff parity function may take the value −∞, and as soon as this occurs with positive probability the expected payoff is −∞. Hence, if there is a non-zero probability that the parity condition is violated, the controller of a mean-payoff parity MDP becomes totally indifferent to the mean-payoff evaluation of rewards. Such a phenomenon does not occur in a priority mean-payoff MDP. Thus, when MDPs are used to model stochastic systems with both fairness assumptions and quantitative constraints, using a priority mean-payoff function guarantees that the expected payoff always depends on both the qualitative (parity) and quantitative (mean-payoff) aspects of the specification.

Smart Sampling for Lightweight Verification of Markov Decision Processes

The classic algorithms to solve MDPs are ‘policy iteration’ and ‘value iteration’ [31]. Model checking algorithms for MDPs may use value iteration applied to probabilities [1, Ch. 10] or solve the same problem using linear programming [3]. The principal challenge of finding optimal schedulers is what has been described as the ‘curse of dimensionality’ [2] and the ‘state explosion problem’ [7]. In essence, these two terms refer to the fact that the number of states of a system increases exponentially with respect to the number of interacting components and state variables. This phenomenon has motivated the design of lightweight sampling algorithms that find ‘near optimal’ schedulers to optimise rewards in discounted MDPs [19], but the standard model checking problem of finding extremal probabilities in non-discounted MDPs is significantly more challenging. Since nondeterministic and probabilistic choices are interleaved in an MDP, schedulers are typically of the same order of complexity as the system as a whole and may be infinite. As a result, previous SMC algorithms for MDPs have considered only memoryless schedulers or have other limitations.
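As a concrete reference for ‘value iteration applied to probabilities’, here is a minimal sketch that computes maximum reachability probabilities in an explicit MDP; the dictionary-based encoding (`states`, `actions`, `trans`, `targets`) is an assumption for illustration, not the interface of the cited model checkers.

```python
def max_reachability_probabilities(states, actions, trans, targets, eps=1e-8):
    """Value iteration for Pr_max(reach `targets`) in an explicit MDP.

    states: iterable of states; actions(s): available actions in s;
    trans(s, a): list of (probability, successor) pairs; targets: set of states.
    These names are illustrative assumptions, not a specific tool's API.
    """
    value = {s: (1.0 if s in targets else 0.0) for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in targets:
                continue
            # maximise over nondeterministic choices, average over probabilistic ones
            best = max(
                (sum(p * value[s2] for p, s2 in trans(s, a)) for a in actions(s)),
                default=0.0,
            )
            delta = max(delta, abs(best - value[s]))
            value[s] = best
        if delta < eps:          # stop when successive sweeps barely change the values
            return value
```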

Efficient Policies for Stationary Possibilistic Markov Decision Processes

There are two natural perspectives to this work. First, as far as the infinite-horizon case is concerned, other types of lexicographic refinements could be proposed. One of these options could be to avoid the duplication of the set of transitions that occur several times in a single trajectory and to consider only those which are observed. A second perspective of this work will be to define reinforcement learning [14] type algorithms for P-MDPs. Such algorithms would use samplings of the trajectories instead of full dynamic programming, or quantile-based reinforcement learning approaches [7].

On the link between infinite horizon control and quasi-stationary distributions

Section 2 is devoted to the definition of the controlled non-linear branching processes and to the proof of preliminary properties. Using the criteria of [CV15a], we also state in Section 2 and prove in Section 5 that, for all Markov controls α, the controlled branching process X^{x,α} admits a unique quasi-stationary distribution π^α with absorption rate λ^α > 0, and that the conditional distributions converge exponentially and uniformly in total variation to the QSD. We extend in Section 3 the problem of infinite-horizon control (1) to positive values of β, and we also state our main results on infinite-horizon control, among which the fact that, if f ≥ 0 and f(0, ·) ≡ 0, then, when β → λ* := inf_{α∈A_M} λ^α, …
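For context, the standard definition of a quasi-stationary distribution and its absorption rate, written here in generic notation (a reminder, not a quotation from the paper):

```latex
% Standard definition (generic notation): for an absorbed Markov process (X_t)
% with absorption time \tau, a probability measure \pi on the non-absorbed states
% is a quasi-stationary distribution (QSD) if conditioning on survival leaves it invariant:
\[
\mathbb{P}_{\pi}\!\left(X_t \in A \mid t < \tau\right) = \pi(A)
\qquad \text{for all measurable } A \text{ and all } t \ge 0 .
\]
% Starting from a QSD, the absorption time is exponentially distributed; its rate
% \lambda > 0 is the absorption rate associated with \pi:
\[
\mathbb{P}_{\pi}\!\left(t < \tau\right) = e^{-\lambda t}.
\]
```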

Exact aggregation of absorbing Markov processes using quasi-stationary distribution

We characterize the conditions under which an absorbing finite Markov process (in discrete or continuous time) can be transformed into a new aggregated process conserving the Markovian property, whose states are elements of a given partition of the original state space. To obtain this characterization, a key tool is the quasi-stationary distribution associated with absorbing processes. It allows the absorbing case to be related to the irreducible one. We are able to calculate the set of all initial distributions of the starting process leading to an aggregated homogeneous Markov process by means of a finite algorithm. Finally, it is shown that the continuous-time case can always be reduced to the discrete one using the uniformization technique.

A Stochastic Minimax Optimal Control Problem on Markov Chains with Infinite Horizon


Collision Avoidance for Unmanned Aircraft using Markov Decision Processes

Before unmanned aircraft can fly safely in civil airspace, robust airborne collision avoidance systems must be developed. Instead of hand-crafting a collision avoidance algorithm for every combination of sensor and aircraft configuration, we investigate the automatic generation of collision avoidance algorithms given models of aircraft dynamics, sensor performance, and intruder behavior. By formulating the problem of collision avoidance as a Markov Decision Process (MDP) for sensors that provide precise localization of the intruder aircraft, or a Partially Observable Markov Decision Process (POMDP) for sensors that have positional uncertainty or limited field-of-view constraints, generic MDP/POMDP solvers can be used to generate avoidance strategies that optimize a cost function balancing flight-plan deviation against the risk of collision. Experimental results demonstrate the suitability of such an approach using four different sensor modalities and a parametric aircraft performance model.

Aggregating Optimistic Planning Trees for Solving Markov Decision Processes

Remark 3. The optimistic part of the algorithm allows a deep exploration of the MDP. At the same time, it biases the expression maximized by π̂ in (4) towards near-optimal actions of the deterministic realizations. Under the assumptions of Theorem 1, the bias becomes insignificant.

Remark 4. Notice that we do not use the optimistic properties of the algorithm in the analysis. The analysis only uses the “safe” part of the SOP planning, i.e. the fact that one sample out of two is devoted to expanding the shallowest nodes. An analysis of the benefit of the optimistic part of the algorithm, similar to the analyses carried out in [9, 2], would be much more involved and is deferred to future work. However, the impact of the optimistic part of the algorithm is essential in practice, as shown in the numerical results.

Approximate solution methods for partially observable Markov and semi-Markov decision processes

First we describe how the local minimum was found, which also shows that the approach of a finite-state controller with policy gradient is quite effective for this problem. The initial policy has equal action probabilities for all internal-state and observation pairs, and has 0.2 as the internal-transition parameter. At each iteration, the gradient is estimated from a simulated sample trajectory of length 20000 (a moderate number for the size of this problem), without using any estimates from previous iterations. Denoting the estimate by ∇̂η, we then project −∇̂η onto the feasible direction set, and update the policy parameter by a small constant step along the projected direction. We used GPOMDP in this procedure, mainly because it needs less computation. The initial policy has average cost −0.234. The cost decreases monotonically, and within 4000 iterations the policy gets into the neighborhood of a local minimum, oscillating around it afterwards, with average costs in the interval [−0.366, −0.361] for the last 300 iterations. As a comparison, the optimal (liminf) average cost of this POMDP is bounded below by −0.460, which is computed using an approximation scheme from [YB04].
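A minimal sketch of the projected policy-gradient loop described above, assuming helper functions `estimate_gradient` (e.g. a GPOMDP-style estimator run on a fresh simulated trajectory) and `project` (projection onto the feasible direction set); these names and signatures are illustrative assumptions, not code from the paper.

```python
import numpy as np

def projected_policy_gradient(theta0, estimate_gradient, project, step=1e-3,
                              iterations=4000, trajectory_length=20000):
    """Minimise average cost by small constant steps along projected gradient directions.

    theta0: initial policy parameters (e.g. a finite-state controller's parameters).
    estimate_gradient(theta, trajectory_length): returns an estimate of grad(eta)
        from one simulated trajectory (no reuse of previous estimates).
    project(theta, direction): projects a descent direction onto the feasible
        direction set at theta, so parameter constraints stay satisfied.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iterations):
        grad = estimate_gradient(theta, trajectory_length)  # fresh estimate each iteration
        direction = project(theta, -grad)                   # feasible descent direction
        theta = theta + step * direction                    # small constant step
    return theta
```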

On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems

The regret is measured with respect to the strategy of playing a policy with the highest expected reward, and it grows as the logarithm of T. More recently, finite-time bounds for the regret have been derived (see Auer et al. (2002); Audibert et al. (2007)). Though the stationary formulation of the MABP makes it possible to address the exploration-versus-exploitation challenge in an intuitive and elegant way, it may fail to be adequate for modelling an evolving environment where the reward distributions undergo changes in time. As an example, in the cognitive medium radio access problem (Lai et al., 2007), a user wishes to opportunistically exploit the availability of an empty channel in a multiple-channel system; the reward is the availability of the channel, whose distribution is unknown to the user. Another application is the real-time optimization of websites by targeting relevant content at individuals and maximizing the general interest by learning and serving the most popular content (such situations have been considered in the recent Exploration versus Exploitation (EvE) PASCAL challenge by Hartland et al. (2006); see also Koulouriotis and Xanthopoulos (2008) and the references therein). These examples illustrate the limitations of stationary MAB models: the probability that a given channel is available is likely to change in time, and the news stories a visitor of a website is most likely to be interested in vary in time.

Incorporating Bayesian networks in Markov Decision Processes

The prescribed inspection type for the first time period, in the case of using the BN, is i2, whereas it is i3, which has a smaller SD (and is hence costlier), in the case of using a transition matrix. This result arises because the BN contributes to the decision process during each time period by introducing the relevant available information; thus, there is less of a need for costly inspection techniques. The expected direct costs decrease sharply after the first period, because the initial belief state of the pavement is very poor. Starting from the second time period, a continuous gradual decrease is observed, ending with zero cost for the final period, because the manager of the pavement is then less concerned with potential future costs. This result may change if a specified state at the end of the planning horizon were imposed on the manager. A total of 100 simulations were performed to test the obtained prescribed IMR strategy for the two cases: (1) using a simple transition matrix; and (2) using a BN. The evolution of the state of the pavement was generated randomly by using the two degradation models. For the belief state obtained at the beginning of each time period, the IMR strategy prescribed by the solution of the problem was implemented, using the BN methodology and the mean transition matrix methodology respectively. The results obtained by using the IMR strategy of the BN had an average expected cost of US$11.78 per m³ and an SD of US$1.36 per m³. The results obtained by using the mean transition matrix IMR strategy had an average expected cost of US$12.32 per m³ and an SD of US$2.12 per m³. These findings confirm the results obtained by applying the proposed methodology. Also, the SD obtained by using the DBN methodology is smaller than that obtained by using the classical transition matrix.

DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes

Our goal is to find good, though not necessarily optimal, solutions for large, factored Markov decision processes. We present an approximate algorithm, DetH*, which applies two types of leverage to the problem: it shortens the horizon using an automatically generated temporal hierarchy, and it reduces the effective size of the state space through state aggregation. DetH* uses connectivity heuristics to break the state space into a number of macro-states. It then assumes that transitions between these macro-states are deterministic, allowing it to quickly compute a top-level policy mapping macro-states to macro-states. Once this policy has been computed, DetH* solves for policies in each macro-state independently. We are able to construct and solve these hierarchies significantly faster than solving the original problem.
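A rough sketch of the two-level structure described above, under stated assumptions: macro-states and inter-macro-state costs are already given, the top level plans deterministically over macro-states, and each macro-state is then solved independently. The Dijkstra-based top-level planner and the `solve_macro` helper are illustrative assumptions, not the DetH* heuristics themselves.

```python
from heapq import heappush, heappop

def top_level_policy(macro_states, macro_edges, goal_macro):
    """Plan over macro-states as if their transitions were deterministic.

    macro_edges: dict mapping a macro-state to {successor_macro: cost}; all
    successors are assumed to appear in macro_states. Returns, for each
    reachable macro-state, the successor macro-state to head for (a
    shortest-path policy toward goal_macro).
    """
    dist = {goal_macro: 0.0}
    successor = {goal_macro: goal_macro}
    # Build reversed edges so Dijkstra computes cost-to-go toward the goal.
    reverse = {m: {} for m in macro_states}
    for m, succs in macro_edges.items():
        for m2, cost in succs.items():
            reverse[m2][m] = cost
    frontier = [(0.0, goal_macro)]
    while frontier:
        d, m = heappop(frontier)
        if d > dist.get(m, float("inf")):
            continue
        for prev, cost in reverse[m].items():
            nd = d + cost
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                successor[prev] = m        # from `prev`, head toward macro-state `m`
                heappush(frontier, (nd, prev))
    return successor

def solve_hierarchy(macro_states, macro_edges, goal_macro, solve_macro):
    """Compute the top-level policy, then solve each macro-state independently.

    solve_macro(macro, target_macro) is an assumed per-macro-state solver,
    e.g. value iteration restricted to that macro-state's underlying states.
    """
    policy = top_level_policy(macro_states, macro_edges, goal_macro)
    return {m: solve_macro(m, policy[m]) for m in macro_states if m in policy}
```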