Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem
Publication:5591254
DOI: 10.1287/opre.18.2.279 · zbMath: 0195.21101 · OpenAlex: W2109693562 · Wikidata: Q113239967 · Scholia: Q113239967 · MaRDI QID: Q5591254
Publication date: 1970
Published in: Operations Research
Full work available at URL: https://doi.org/10.1287/opre.18.2.279
Related Items
Temporal logics for the specification of performance and reliability ⋮ Communicating MDPs: Equivalence and LP properties ⋮ Generalized Markovian decision processes ⋮ Survey of linear programming for standard and nonstandard Markovian control problems. Part I: Theory ⋮ A policy improvement method for constrained average Markov decision processes ⋮ An optimality principle for Markovian decision processes ⋮ Solving stochastic dynamic programming problems by linear programming — An annotated bibliography ⋮ Markov Branching Decision Chains with Interest-Rate-Dependent Rewards ⋮ A new optimality criterion for discrete dynamic programming