Top 20 Approximate Policy Iteration for Generalized Semi-Markov Decision Processes: an Improved Algorithm

Approximate Policy Iteration for Generalized Semi-Markov Decision Processes: an Improved Algorithm

... To summarize, the improvement of the current policy is performed online: for each visited state starting in s0 we perform one Bellman backup using the value function evaluation from the ...

15

Algorithmic aspects of mean–variance optimization in Markov decision processes

... literature. For example, (Guo, Ye, & Yin, 2012) consider a mean-variance optimization problem, but subject to a constraint on the vector of expected rewards starting from each state, which results in a simpler ...

26

Approximate solution methods for partially observable Markov and semi-Markov decision processes

... approach for average cost POMDPs in fact grows out from the same approach for discounted ... convergence for the subgradient based cost approximation proposed in the same ... later improved by ...

169

DetH*: Approximate Hierarchical Solution of Large Markov Decision Processes

... or approximate [St-Aubin et ... value iteration [Dai and Goldsmith, 2007; Dai et ... of an initial state to reduce the state space ... the algorithm output the optimal policy, we can ...

9

Lightweight Verification of Markov Decision Processes with Rewards

... Network Virus Infection Our network virus infection case study is based on [21] and initially comprises the following sets of linked nodes: a set containing one node infected by a virus, a set with no infected nodes and ...

16

PALMA, an improved algorithm for DOSY signal processing

... value for λ should be chosen from assumptions on the data based on explicit previous knowledge, however on a practical point, this is not ... tailored-ITAMeD algorithm, somewhat similar to λ and chose the ...

34

An Efficient Algorithm for Cooperative Semi-Bandits

... receive semi-bandit feedback and exchange some succinct local ... resulting algorithm while retaining minimax optimal regret ... Leader algorithm, that implements a new loss estimation procedure ...

23

Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes

... to decision under uncertainty have been considered by [ 15 – 20 ... qualitative decision theory ... the decision criteria are either the optimistic qualitative utility or its pessimistic ...

26

Lexicographic refinements in stationary possibilistic Markov Decision Processes

... defines an extension of lexicographic refinements to finite horizon possibilistic Markov decision processes and proposes a value iteration algorithm that looks for ...

22

The Class of Semi-Markov Accumulation Processes

... accumulation processes hold a very prominent role in many application areas such as queueing theory, risk models, manufacturing, etc ... accumulation processes in the literature is the class of ...

5

The approach in Markov decision processes revisited

22

Asymptotic properties of constrained Markov decision processes

22

Limits of Multi-Discounted Markov Decision Processes

... (see for example [15]), existence of pure stationary optimal strategies in discounted MDPs is used to show that the function which maps the discount factor to the value of the discounted MDP is a rational ...

13

Multi-criteria Search Algorithm: An Efficient Approximate K-NN Algorithm for Image Retrieval

... search for each descriptor of the query, which is problematic when a large number of descriptors per image is ... of an image to a single vector (called signature) have been proposed [9, ... search for ...

6

On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes

... value-iteration algorithm for approximately computing the value of some fixed policy, and for which one can prove a dependency of the form ...

5

A semi-Lagrangian algorithm in policy space for hybrid optimal control problems

... (4.1)–(4.2) for respectively Q(t) = 1 and Q(t) = 2, and we have to minimize a combination between the growth of the tumor mass and the toxic effect of the drug on healthy cells (note that this latter term appears ...

20

Applications of Markov Decision Processes in Communication Networks : a Survey

55

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

... optimal policy), but it excludes unreachable states (otherwise the resulting MDP would be ... has a hole in the middle is a valid state (e.g., as an initial state) but it cannot be observed/reached ...

28

A Learning Design Recommendation System Based on Markov Decision Processes

... options for teachers to use the Grasha-Reichmann Learning Styles Scales (GRLSS): by either designing instructional processes to accommodate particular styles, or by designing them in such a way that ...

9