A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs

DOI: 10.14736/kyb-2019-1-0081; zbMath: 1449.90356; OpenAlex: W2921102909; MaRDI QID: Q5227201

Óscar Vega-Amaya, Joaquín López-Borbón

Publication date: 5 August 2019

Published in: Kybernetika

Full work available at URL: http://hdl.handle.net/10338.dmlcz/147707

zbMATH Keywords

Markov decision processes; approximate value iteration algorithm; average cost criterion; contraction and non-expansive operators; perturbed Markov decision models

Mathematics Subject Classification ID

Approximation methods and heuristics in mathematical programming (90C59); Optimal stochastic control (93E20); Markov and semi-Markov decision processes (90C40)

Cites Work

A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
Perspectives of approximate dynamic programming
Approximation of Markov decision processes with general state space
Markov chains and stochastic stability
On the optimality equation for average cost Markov control processes with Feller transition probabilities
Continuous state dynamic programming via nonexpansive approximation
Discretization procedures for adaptive Markov control processes
Application of average dynamic programming to inventory systems
The approximation of continuous functions by positive linear operators
Approximate receding horizon approach for Markov decision processes: average reward case
The average cost optimality equation: a fixed point approach
On the existence of fixed points for approximate value iteration and temporal-difference learning
Solutions of the average cost optimality equation for Markov decision processes with weakly continuous kernel: the fixed-point approach revisited
A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis
Markov chains and invariant probabilities
Simulation-based algorithms for Markov decision processes
Stochastic approximations of constrained discounted Markov decision processes
A note on a variation of Doeblin's condition for uniform ergodicity of Markov chains
Recurrence conditions for Markov decision processes with Borel state space: A survey
Learning Algorithms for Markov Decision Processes with Average Cost
Approximate policy iteration: a survey and some new methods
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces
Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes
Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
What you should know about approximate dynamic programming
On Actor-Critic Algorithms
Zero-Sum Average Semi-Markov Games: Fixed-Point Solutions of the Shapley Equation
Convergence of Simulation-Based Policy Iteration
Analysis of a Numerical Dynamic Programming Algorithm Applied to Economic Models
A generalization of Ueno's inequality for n-step transition probabilities
On the Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces
Convergence Properties of Policy Iteration
Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes
Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
Approximate Dynamic Programming
Performance Loss Bounds for Approximate Value Iteration with State Aggregation
A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees
Performance Bounds in $L_p$‐norm for Approximate Value Iteration
