
[PDF] Top 20 Bounds for Markov Decision Processes

10,000 documents matching "Bounds for Markov Decision Processes" were found on our website; the top 20 are listed below.

Bounds for Markov Decision Processes

... Lower bounds via martingale ... lower bounds, which constitutes an active area of research, relies on ‘information ... penalty for relaxing the restrictions on information available to the ...

Efficient Policies for Stationary Possibilistic Markov Decision Processes

... Possibilistic Markov Decision Processes offer a compact and tractable way to represent and solve problems of sequential decision under qualitative uncertainty ... appealing for its ability to ...

Efficient Policies for Stationary Possibilistic Markov Decision Processes

... Keywords: Markov Decision process, Possibility theory, lexicographic comparisons, possibilistic qualitative utilities. 1 Introduction. The classical paradigm for sequential decision making ...

The steady-state control problem for Markov decision processes

... Labeled Markov decision processes. The steady-state control problem for MDP describes random processes in which interactions with users or with the environment drive the system towards a ...

Adaptive Planning for Markov Decision Processes with Uncertain Transition Models via Incremental Feature Dependency Discovery

... autonomy for unmanned aerial vehicles (UAVs) through planning algorithms is needed to tackle real-life missions such as persistent surveillance, maintaining wireless network connectivity, and search and rescue ...

On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes

... Abstract: We consider infinite-horizon γ-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. We consider the algorithm Value Iteration and ...

Collision Avoidance for Unmanned Aircraft using Markov Decision Processes

... a Markov Decision Process (MDP), or more generally as a Partially Observable Markov Decision Process (POMDP) to also account for observation ...

Smart Sampling for Lightweight Verification of Markov Decision Processes

... In [15] the authors present an SMC algorithm to decide whether there exists a memoryless scheduler for a given MDP, such that the probability of a property is above a specified threshold. The algorithm has an ...

Aggregating Optimistic Planning Trees for Solving Markov Decision Processes

... results for solving decision making ... algorithms for deterministic systems and stochastic systems [8, 9, 17], and global optimization of stochastic functions that are only accessible through ... [13] ...

Approximate Policy Iteration for Generalized Semi-Markov Decision Processes: an Improved Algorithm

... To summarize, the improvement of the current policy is performed online: for each visited state starting in s0 we perform one Bellman backup using the value function evaluation from the ...

Distribution-based objectives for Markov Decision Processes

... ask for the existence of a strategy escaping the convex polytope H, or equivalently whether all strategies are safe (stay inside ... undecidability for PFA required both ...

Approximate solution methods for partially observable Markov and semi-Markov decision processes

... fictitious processes. 3.5 Examples. We give more examples of fictitious processes that correspond to several approximation schemes from the ... [ZH01] for discounted ... chosen for all initial ...

Algorithmic aspects of mean–variance optimization in Markov decision processes

... directions for future research, which we briefly outline ... valid for specially structured ... the decision maker has to decide which MDP to activate, while the other MDPs remain ... (“index”) ...

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

... states (otherwise it may not be possible to learn the optimal policy), but it excludes unreachable states (otherwise the resulting MDP would be non-communicating). This requires a considerable amount of prior knowledge ...

DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes

... 7 Conclusion. There have been many adaptations to SPUDD, from the suggestions of optimization in the original paper [Hoey et al., 1999], to approximating value functions [St-Aubin et al., 2000], to using affine ADDs ...

Markov Decision Petri Net and Markov Decision Well-Formed Net Formalisms

... formalisms, Markov Decision Petri Nets (MDPNs) and Markov Decision Well-formed Nets (MDWNs), useful for the modeling and analysis of distributed systems with probabilistic and non ...

Bootstrap and uniform bounds for Harris Markov chains

... methods for dependent data such as moving block bootstrap (MBB), non-overlapping block bootstrap (NBB) or circular block bootstrap (CBB), to name just a few (see for instance [93] for an exhaustive ...

Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes

... value for the infinitely repeated problem, namely the strong uniform ... that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is ...

A Learning Design Recommendation System Based on Markov Decision Processes

... options for teachers to use the Grasha-Reichmann Learning Styles Scales (GRLSS): by either designing instructional processes to accommodate particular styles, or by designing them in such a way that ...

Pathwise uniform value in gambling houses and Partially Observable Markov Decision Processes

... the decision-maker chooses a probability on X which is compatible with the correspondence and the current ... POMDPs, for a finite state space and any action and signal ...
