scientific article
Publication:3912356
zbMath: 0462.90055, MaRDI QID: Q3912356
No author found.
Publication date: 1981
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Keywords: strong convergence, Markov decision processes, stochastic dynamic programming, successive approximations, average reward criterion, Liapunov functions, total reward criterion, nearly optimal strategies, policy space algorithms, conservingness, epsilon-optimal stationary Markov strategies, go-ahead strategies, two person zero-sum Markov games, value algorithms
Stochastic programming (90C15), 2-person games (91A05), Dynamic programming (90C39), Other game-theoretic models (91A40)
Related Items
Asymptotic expansions for dynamic programming recursions with general nonnegative matrices
On Nash equilibrium solutions in nonzero-sum stochastic games with complete information
Algorithms for stochastic games - A survey
The numerical exploitation of periodicity in Markov decision processes
Repair limit replacement
Non-homogeneous Markov decision processes with a constraint
Regular Policies in Abstract Dynamic Programming
The transformation method for continuous-time Markov decision processes
A value iteration method for undiscounted multichain Markov decision processes
Stochastic dynamic programming with non-linear discounting
Optimal strategies for some team games
Markov decision processes
Note on discounted continuous-time Markov decision processes with a lower bounding function
Solving infinite horizon discounted Markov decision process problems for a range of discount factors
Constrained discounted Markov decision processes with Borel state spaces
Partially observable game-theoretic agent programming in Golog
Mean, variance and probabilistic criteria in finite Markov decision processes: A review
A Fenchel-Moreau-Rockafellar type theorem on the Kantorovich-Wasserstein space with applications in partially observable Markov decision processes
On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes
On the reduction of total-cost and average-cost MDPs to discounted MDPs
Control of arrivals to two queues in series