A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
DOI: 10.14736/kyb-2019-1-0081; zbMath: 1449.90356; OpenAlex: W2921102909; MaRDI QID: Q5227201
Óscar Vega-Amaya, Joaquín López-Borbón
Publication date: 5 August 2019
Published in: Kybernetika
Full work available at URL: http://hdl.handle.net/10338.dmlcz/147707
Markov decision processes; approximate value iteration algorithm; average cost criterion; contraction and non-expansive operators; perturbed Markov decision models
Approximation methods and heuristics in mathematical programming (90C59); Optimal stochastic control (93E20); Markov and semi-Markov decision processes (90C40)
Cites Work
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- Approximation of Markov decision processes with general state space
- Markov chains and stochastic stability
- On the optimality equation for average cost Markov control processes with Feller transition probabilities
- Continuous state dynamic programming via nonexpansive approximation
- Discretization procedures for adaptive Markov control processes
- Application of average dynamic programming to inventory systems
- The approximation of continuous functions by positive linear operators
- Approximate receding horizon approach for Markov decision processes: average reward case
- The average cost optimality equation: a fixed point approach
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- Solutions of the average cost optimality equation for Markov decision processes with weakly continuous kernel: the fixed-point approach revisited
- A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis
- Markov chains and invariant probabilities
- Simulation-based algorithms for Markov decision processes
- Stochastic approximations of constrained discounted Markov decision processes
- A note on a variation of Doeblin's condition for uniform ergodicity of Markov chains
- Recurrence conditions for Markov decision processes with Borel state space: A survey
- Learning Algorithms for Markov Decision Processes with Average Cost
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces
- Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- What you should know about approximate dynamic programming
- On Actor-Critic Algorithms
- Zero-Sum Average Semi-Markov Games: Fixed-Point Solutions of the Shapley Equation
- Convergence of Simulation-Based Policy Iteration
- Analysis of a Numerical Dynamic Programming Algorithm Applied to Economic Models
- A generalization of Ueno's inequality for n-step transition probabilities
- On the Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces
- Convergence Properties of Policy Iteration
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
- Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
- Approximate Dynamic Programming
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees
- Performance Bounds in $L_p$‐norm for Approximate Value Iteration