[PDF] Top 20 results for "The steady-state control problem for Markov decision processes"
10,000 documents matching "The steady-state control problem for Markov decision processes" were found on our website. Below are the top 20 most relevant.
The steady-state control problem for Markov decision processes
... defined the steady-state control problem for MDP, and shown that this question is decidable for (ergodic) MDP in polynomial time, and for labeled MDP in polynomial ...
Lightweight Verification of Markov Decision Processes with Rewards
... with the specified confidence. In the case of estimation, the specified confidence is with respect to the estimate, not the ... Quantifying the optimality of estimates is an ...
A Learning Design Recommendation System Based on Markov Decision Processes
... on the idea that the end-user would have a good knowledge of the IMS-LD; however, experience has shown that the use of such tools was not easy and in the end did not meet the ...
Constrained Markov Decision Processes with Total Expected Cost Criteria
... pute the optimal value and an optimal stationary policy for ...quired the strong assumption that s(β, u) is finite for any ...excludes the shortest path problem in which policies ...
Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes
... possibilistic decision making, a decision can be seen as a possibility distribution over a finite set of outcomes [20] ...stage decision making problem, a utility function maps each outcome ...
Towards Control of Steady State Plasma on Tore Supra
... E. Plasma pulse termination control. Disruptions are a major problem for tokamak operation. During such an event, forces of up to a hundred tons can be applied to structures and a significant fraction of ...
Bayesian state estimation in partially observable Markov processes
... . The CGHF implements the Gauss-Hermite quadrature technique specified in Appendix B to compute these integrals in the case where the exact solution is ...summarize, the PF ...on ...
On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes
... that the problem of "computing an approximately optimal policy" is significantly harder than that of "approximately computing the value of some fixed ...in the literature that supports ...
DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes
... in the domain whose values were not well-connected. We use the check on line 10 to identify variables with large numbers of values that cannot reach each other. The partition formed on line 9 ...
Decentralized control of Partially Observable Markov Decision Processes using belief space macro-actions
... observable Markov decision problems. In Proceedings of the Sixteenth Conference on Artificial Intelligence (AAAI), pages 541–548, ...Decentralized control of partially observable Markov ...
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
... cause the span to become too large. Unfortunately, an accurate knowledge of the bias span may not be easier to obtain than designing a well-specified state ...on the doubling trick [15] to ...
Steady-state and periodic exponential turnpike property for optimal control problems in Hilbert spaces
... of the present paper is to establish the exponential turnpike property for general infinite-dimensional nonlinear optimal control problems under exponential stabilizability and detectability ...
Decentralized control of multi-robot systems using partially observable Markov Decision Processes and belief space macro-actions
... Dec-POMDP problem stated in Eq. (2.12) is undecidable (as is the infinite-horizon POMDP problem even in discrete settings ...exist for problems with continuous state ...extended ...
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
... intended for the case (BK)^{H−1} ≥ SA, which does not hold in our experiments (despite the large state ...in the bandit ...results, for their scaling in n_t^h(s, a), undoing a ...
Limits of Multi-Discounted Markov Decision Processes
... computation, the class of MDPs with pure stationary optimal strategies has good algorithmic properties: under weak hypotheses, the values of these MDPs are ...about the computability of φ: we suppose ...
Algorithmic aspects of mean–variance optimization in Markov decision processes
... of the cumulative past ...function, state augmentation is unnecessary, and optimal policies can be found by solving a modified Bellman equation (Chung & Sobel, ...surrogate for trading off mean ...
A Markov Decision Evolutionary Game for the study of a Dynamic Hawk and Dove Problem
... current state. For each given initial state x of a player and each policy u of that player, which we call a tagged player, if the rest of the population uses some common stationary ...
Approximate Policy Iteration for Generalized Semi-Markov Decision Processes: an Improved Algorithm
... To summarize, the improvement of the current policy is performed online: for each visited state starting in s0 we perform one Bellman backup using the value function evaluation from the ...
Related topics